Generate images from sketches, edges, poses, and depth maps
Generate edited video frames using text prompts