
A 15-second music video needs only three inputs: one character reference image, one song longer than 15 seconds, and three music video descriptions. This workflow uses the InfiniteTalk and HuMo models.
For local deployment, make sure the following models are installed:
1. HuMo (Kijai's quantized version)
https://huggingface.co/Kijai/WanVideo_comfy/tree/main/HuMo
Placement:
whisper_large_v3_encoder_fp16.safetensors goes in models/audio_encoders
Wan2_1-HuMo-14B_fp16.safetensors goes in models/diffusion_models
2. InfiniteTalk
https://huggingface.co/MeiGen-AI/InfiniteTalk/tree/main/comfyui
Placement: models/diffusion_models
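The placement list above can be checked with a short script before loading the workflow. This is only a minimal sketch: the install root "ComfyUI" and the helper names find_missing and has_infinitetalk are hypothetical. Because the InfiniteTalk file name is not given here, the script assumes it contains "infinitetalk" and ends in .safetensors.

```python
from pathlib import Path

# Files named in the list above, relative to the ComfyUI root directory.
REQUIRED_FILES = [
    "models/audio_encoders/whisper_large_v3_encoder_fp16.safetensors",
    "models/diffusion_models/Wan2_1-HuMo-14B_fp16.safetensors",
]


def find_missing(comfy_root, required=REQUIRED_FILES):
    """Return the required relative paths that do not exist under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in required if not (root / rel).is_file()]


def has_infinitetalk(comfy_root):
    """Check for an InfiniteTalk model in models/diffusion_models.

    Assumption: the file name contains 'infinitetalk' (any case) and
    ends in .safetensors. The exact name is not given in this document.
    """
    folder = Path(comfy_root) / "models" / "diffusion_models"
    if not folder.is_dir():
        return False
    return any(
        "infinitetalk" in p.name.lower() for p in folder.glob("*.safetensors")
    )


if __name__ == "__main__":
    root = "ComfyUI"  # hypothetical install location; adjust to your setup
    for rel in find_missing(root):
        print("missing:", rel)
    if not has_infinitetalk(root):
        print("no InfiniteTalk model found in models/diffusion_models")
```

Run it from the directory that contains your ComfyUI folder. If it prints nothing, all three models were found where the workflow expects them.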