MultiTalk is a powerful new digital human model: given an image and an audio track, it generates high-quality digital humans with excellent speaking and singing quality.

This workflow includes several optimizations:

1: One-click prompt expansion by a large language model, which improves generation quality

2: A configurable maximum video side length; if the input does not exceed it, the original width and height are kept

3: Intelligent audio clipping; a value of 0 uses the default length, the same as the original video
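
The side-length and clipping rules in items 2 and 3 can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the workflow's actual node code: the function names `fit_to_max_side` and `clip_duration` and their parameters are hypothetical, and the real nodes may round dimensions differently (for example, to a multiple the model requires).

```python
def fit_to_max_side(width: int, height: int, max_side: int = 832) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    Hypothetical helper: sizes that do not exceed the limit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        # Not exceeded: keep the original width and height.
        return width, height
    # Scale both sides by the same factor to preserve the aspect ratio.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


def clip_duration(source_seconds: float, clip_seconds: float = 0.0) -> float:
    """Return how many seconds to keep.

    Hypothetical helper: 0 (or a negative value) means no clipping, so the
    full source length is used; a request longer than the source is capped.
    """
    if clip_seconds <= 0:
        return source_seconds
    return min(clip_seconds, source_seconds)
```

For instance, under these assumptions a 1920x1080 input with the default limit of 832 would be scaled to 832x468, while a 640x480 input would pass through unchanged.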


48 GB of VRAM is recommended for better results. The default side length is 832, and an 11-second clip takes about 7-10 minutes to process.