The newly open-sourced ACE Step music generation model, combined with the Float digital human

Track separation is applied to the audio first so that the digital human can recognize the vocals more reliably, which lets it match lip movements accurately.

It runs locally, and 8 GB of VRAM is enough.

Detailed usage instructions are in the comments.