Voice-driven talking-image digital human workflow