To keep the character consistent, avoid shots in which the face is not visible, for example large head turns or squatting.

Also, the second round of frame extraction takes the last frame by default, but some prompts leave the face in the last frame blurry. When that happens, select an earlier frame. Check this carefully every time!
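
If you want to check for a blurry last frame outside the workflow, the idea can be sketched roughly as follows. This is only an illustration, not part of the workflow: the function names, the `lookback` and `keep_ratio` parameters, and the use of Laplacian variance as a sharpness score are all assumptions, and the score covers the whole frame rather than the face alone. Frames are assumed to be 2D lists of grayscale values.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over a 2D list of grayscale values.

    A higher value means more edge detail, which is used here as a rough
    stand-in for sharpness (assumption: whole frame, not just the face).
    """
    height, width = len(gray), len(gray[0])
    values = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            values.append(lap)
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pick_start_frame(frames, lookback=8, keep_ratio=0.8):
    """Return the index of the latest acceptably sharp frame.

    Only the last `lookback` frames are considered. A frame is accepted if
    its score is at least `keep_ratio` times the best score in that window.
    Both thresholds are hypothetical defaults; tune them by eye.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    start = max(0, len(frames) - lookback)
    scores = [laplacian_variance(frame) for frame in frames[start:]]
    best = max(scores)
    # Walk backwards so the frame closest to the end of the clip wins.
    for offset in range(len(scores) - 1, -1, -1):
        if scores[offset] >= keep_ratio * best:
            return start + offset
    return len(frames) - 1
```

The returned index is only a suggestion for which frame to extract; still look at the face in that frame before starting the next round.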

In theory the shot can be extended indefinitely, but the result depends heavily on the prompts, so adjust them as needed!


num_frames: total number of frames, usually no more than 85
width: output width in pixels
height: output height in pixels
Keep the width-to-height ratio of the reference video; do not turn a landscape video into portrait dimensions.

Make sure width and height are set to the same values in the two nodes above.

The default resolution is the larger 768*512 (or 512*768 for portrait), which takes an estimated 9 minutes.
If that is too slow, switch to 512*384 or 384*512, which is much faster.
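
The choice between these sizes can be written down as a small helper. This is a minimal sketch for illustration only: `pick_size` and its `fast` flag are hypothetical names, and treating a square reference video as landscape is an assumption. The sizes themselves are the ones listed above.

```python
# Presets from the notes above, written as (width, height) for landscape.
DEFAULT_LANDSCAPE = (768, 512)
FAST_LANDSCAPE = (512, 384)


def pick_size(ref_width, ref_height, fast=False):
    """Return (width, height) matching the reference video's orientation.

    `fast=True` selects the smaller preset for quicker generation.
    A square reference is treated as landscape (assumption).
    """
    width, height = FAST_LANDSCAPE if fast else DEFAULT_LANDSCAPE
    if ref_height > ref_width:
        # Portrait reference: swap so the output is portrait too.
        width, height = height, width
    return width, height
```

For example, a 1080*1920 portrait reference gives 512*768 by default and 384*512 with `fast=True`. Enter the same pair in both nodes.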



Bilibili video: https://www.bilibili.com/video/BV1fvP7e2EVA/

YouTube tutorial: https://www.youtube.com/watch?v=mFIWRMmuwGs

Knowledge Planet: https://t.zsxq.com/7F90A

Text and image tutorial: https://t.zsxq.com/vKMqo