
VoiceGate is an intelligent cross-language video dubbing engine built on VoxCPM2 and ComfyUI. VoxCPM2 supports 30 languages (including eight Southeast Asian languages) and 9 Chinese dialects (Cantonese, Sichuanese, Wu, Northeastern Mandarin, Minnan, and others), and offers voice cloning and timbre design. Through its in-house VoiceBridge plugin, the engine aligns TTS speech with SRT subtitle timestamps at frame level, keeping the dubbing synchronized with the visuals.
The complete workflow covers ASR subtitle extraction, LLM translation, multilingual TTS, and audio alignment and merging. It is orchestrated as a visual node graph and works out of the box.
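To make the alignment step more concrete, here is a minimal Python sketch of one way a TTS clip could be fitted to its SRT subtitle slot. It is not the VoiceBridge implementation, which is not described here; the names `Cue`, `parse_srt`, and `tempo_ratio`, and the `max_ratio` cap of 1.5, are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass
from typing import List

# SRT timestamps look like "00:01:02,345" (hours:minutes:seconds,milliseconds).
_TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Cue:
    """One subtitle entry (hypothetical helper type)."""
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


def srt_time_to_seconds(timestamp: str) -> float:
    """Convert an SRT timestamp string to seconds."""
    match = _TIMESTAMP.fullmatch(timestamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt_text: str) -> List[Cue]:
    """Parse SRT text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2:
            continue  # skip malformed blocks
        start_raw, end_raw = lines[1].split("-->")
        cues.append(Cue(
            index=int(lines[0]),
            start=srt_time_to_seconds(start_raw),
            end=srt_time_to_seconds(end_raw),
            text="\n".join(lines[2:]),
        ))
    return cues


def tempo_ratio(cue: Cue, tts_duration: float, max_ratio: float = 1.5) -> float:
    """Speed-up factor needed for a TTS clip to fit the cue's time slot.

    1.0 means the clip already fits (any remainder can be padded with
    silence). The factor is capped at max_ratio, an assumed limit beyond
    which sped-up speech would likely sound unnatural.
    """
    slot = cue.end - cue.start
    if slot <= 0:
        raise ValueError("cue has a non-positive duration")
    return min(max(tts_duration / slot, 1.0), max_ratio)
```

For example, a 3.0-second TTS clip for a cue spanning 00:00:01,000 to 00:00:03,500 (a 2.5-second slot) would need to be sped up by a factor of about 1.2. A clip shorter than its slot keeps a factor of 1.0.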
Input: Video and target language
Output: A video dubbed in the target language with a voice cloned from the input video, together with the corresponding subtitles. The output audio is aligned with the input video at the subtitle level.
Please copy the target language from the list below: Arabic, Burmese, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Khmer, Korean, Lao, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Tagalog, Thai, Turkish, Vietnamese
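If the target language is supplied programmatically rather than copied by hand, a small check against the list above can catch typos before a run. The sketch below is illustrative only: `normalize_target_language` is a hypothetical helper, not part of VoiceGate, and whether the workflow itself treats language names as case-sensitive is not stated here.

```python
# The 30 target languages listed above, spelled exactly as in the list.
SUPPORTED_LANGUAGES = (
    "Arabic", "Burmese", "Chinese", "Danish", "Dutch", "English",
    "Finnish", "French", "German", "Greek", "Hebrew", "Hindi",
    "Indonesian", "Italian", "Japanese", "Khmer", "Korean", "Lao",
    "Malay", "Norwegian", "Polish", "Portuguese", "Russian", "Spanish",
    "Swahili", "Swedish", "Tagalog", "Thai", "Turkish", "Vietnamese",
)

# Lower-cased lookup so "thai" or " Thai " resolve to the listed spelling.
_LOOKUP = {language.lower(): language for language in SUPPORTED_LANGUAGES}


def normalize_target_language(name: str) -> str:
    """Return the listed spelling of a language name, or raise ValueError."""
    key = name.strip().lower()
    if key not in _LOOKUP:
        raise ValueError(f"unsupported target language: {name!r}")
    return _LOOKUP[key]
```

Passing the normalized name on to the workflow means the value matches the list spelling exactly.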
GitHub Project:
https://github.com/YanTianlong01/VoiceGate