
Producing a 15-second MV requires only one character reference image, a song longer than 15 seconds, and three MV descriptions. The workflow uses the InfiniteTalk and HuMo models.
Local deployment requires the following models to be installed:
1. HuMo (Kijai's quantized version)
https://huggingface.co/Kijai/WanVideo_comfy/tree/main/HuMo
Folder placement:
whisper_large_v3_encoder_fp16.safetensors -> models/audio_encoders
Wan2_1 HuMo 14B_fp16.safetensors -> models/diffusion_models
2. InfiniteTalk
https://huggingface.co/MeiGen-AI/InfiniteTalk/tree/main/comfyui
Folder placement: models/diffusion_models
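To confirm the placement listed above before loading the workflow, a small check like the following can help. This is only a sketch: the function name find_missing and the EXPECTED_FILES table are made up for illustration, the file names are copied as written in this note (adjust them if your downloaded files are named differently), and the InfiniteTalk file is left out because its file name is not given here.

```python
from pathlib import Path

# Folder -> file names, relative to the ComfyUI root directory.
# Names are copied from the note above; edit them to match your downloads.
# The InfiniteTalk file is not listed because the note does not name it.
EXPECTED_FILES = {
    "models/audio_encoders": ["whisper_large_v3_encoder_fp16.safetensors"],
    "models/diffusion_models": ["Wan2_1 HuMo 14B_fp16.safetensors"],
}


def find_missing(comfy_root):
    """Return the expected model paths that do not exist under comfy_root."""
    root = Path(comfy_root)
    missing = []
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            path = root / folder / name
            if not path.is_file():
                missing.append(path)
    return missing
```

Calling find_missing with the path to your ComfyUI folder returns an empty list when both files are in place; otherwise it returns the paths that still need to be filled.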