Upload a single image. The prompt is reverse-engineered from that image and then used to generate a 16-second video. At 512 resolution, a run takes about 6 minutes and consumes around 70 tokens.