

Video length is determined by the audio duration.
Lip movements are synced to the audio automatically.
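Because the clip length follows the audio, the frame count can be estimated from the audio duration before a run. The sketch below is illustrative only: the function name `frames_for_audio`, the default of 25 fps, and the extra initial frame are assumptions made here, not values taken from the workflow. Check the frame rate set in the workflow's input nodes.

```python
def frames_for_audio(duration_s: float, fps: float = 25.0) -> int:
    """Estimate how many frames cover an audio clip.

    Assumption: one frame per 1/fps seconds plus one initial frame.
    The 25 fps default is a placeholder, not a documented workflow value.
    """
    if duration_s <= 0 or fps <= 0:
        raise ValueError("duration_s and fps must be positive")
    return round(duration_s * fps) + 1


if __name__ == "__main__":
    for seconds in (6.4, 10.0):
        print(f"{seconds} s of audio -> {frames_for_audio(seconds)} frames")
```

Under these assumptions, a 10-second audio clip maps to 251 frames; with a different frame rate the same audio yields a different count.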
Ultra-fast optimized version: samples once at low resolution, then twice after upscaling.
1920*1088, 161 frames in total: about 5 minutes
1920*1088, 251 frames in total: about 6 minutes
2560*1440, 161 frames in total: about 6 minutes
2560*1440, 251 frames in total: about 10 minutes
24 GB of VRAM is enough to run; 48 GB is recommended.
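To compare the listed configurations, the approximate timings above can be turned into a frames-per-minute figure. This is a rough sketch using only the numbers quoted in this post; the names `RUNS`, `frames_per_minute`, and `summarize` are made up for illustration, and actual speed will depend on hardware and load.

```python
# Approximate timings copied from the list above:
# (width, height, total frames, minutes).
RUNS = [
    (1920, 1088, 161, 5),
    (1920, 1088, 251, 6),
    (2560, 1440, 161, 6),
    (2560, 1440, 251, 10),
]


def frames_per_minute(frames: int, minutes: float) -> float:
    """Average number of frames generated per minute of run time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return frames / minutes


def summarize(runs=RUNS):
    """Return (resolution, frames, frames per minute) for each run."""
    return [
        (f"{w}*{h}", n, round(frames_per_minute(n, m), 1))
        for w, h, n, m in runs
    ]


if __name__ == "__main__":
    for resolution, frames, rate in summarize():
        print(f"{resolution}, {frames} frames: about {rate} frames/minute")
```

Since the source timings are rounded to whole minutes, the resulting rates are only indicative.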
AI Applications:
Supports text-to-video, as well as video generation from a first frame, a last frame, first and last frames, or first, middle, and last frames. Enable the corresponding image inputs as needed.
Supports switching between automatic and manual prompts.
Workflow:
Fluorescent green nodes are input parameters; adjust them as needed.
Use the pink switches to toggle functions.
Output: a video segment prompt text file.
👇 (Use in combination) Qwen3 TTS dialogue audio generation for one or two speakers (supports uploading a voice to clone, or generating a voice from a text description)
https://www.runninghub.cn/post/2018391521661292545?inviteCode=ishbfzc1
👇 (Use in combination) Qwen3 TTS dialogue audio generation for up to 8 speakers (supports uploading a voice to clone, or generating a voice from a text description)
https://www.runninghub.cn/post/2017578327875264513?inviteCode=ishbfzc1
👇 (Use in combination) HeartMuLa enhanced song generation with timbre replacement (includes automatic lyric generation)
https://www.runninghub.cn/post/2016577029856043009?inviteCode=ishbfzc1
👇 LTX2.0 first-and-last-frame video generation V3, ultra-fast optimized version (non-audio-driven, three sampling passes)
https://www.runninghub.cn/post/2017376776682475522?inviteCode=ishbfzc1