FantasyTalking, a project jointly launched by Alibaba and Beijing University of Posts and Telecommunications, is another major breakthrough in digital human technology. From just one ID photo, it can generate digital human videos with vivid expressions and natural movements.



Three major innovative modules
Audio-visual alignment strategy: Captures the global correlation between the audio track and facial expressions, body movements, and background dynamics
Facial cross-attention: Locks onto facial features using only 3% of the model's parameters; identity drift over a 10-minute video stays below 0.3% (see the sketch after this list)
Motion intensity modulation network: Independently controls 22 facial/body amplitude parameters (e.g., eyebrow height, shoulder swing frequency)
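The article does not reproduce FantasyTalking's actual architecture, but the cross-attention and intensity-modulation ideas above can be illustrated with a minimal PyTorch sketch. Everything below is an assumption: the class names, the dimensions (`latent_dim`, `id_dim`), and the sigmoid-gain mapping in `MotionIntensityModulator` are illustrative, not the project's real layers.

```python
import torch
import torch.nn as nn

class FaceCrossAttention(nn.Module):
    """Hypothetical sketch: video latents attend to identity (face) embeddings,
    re-injecting the reference face at each block. Not FantasyTalking's code."""

    def __init__(self, latent_dim: int = 1024, id_dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(
            embed_dim=latent_dim, num_heads=num_heads,
            kdim=id_dim, vdim=id_dim, batch_first=True,
        )
        self.norm = nn.LayerNorm(latent_dim)

    def forward(self, video_tokens: torch.Tensor, id_tokens: torch.Tensor) -> torch.Tensor:
        # Queries come from video latents (B, T, latent_dim); keys/values come
        # from the identity embedding (B, N, id_dim).
        attended, _ = self.attn(video_tokens, id_tokens, id_tokens)
        return self.norm(video_tokens + attended)


class MotionIntensityModulator(nn.Module):
    """Hypothetical sketch: map per-channel intensity controls (the article
    mentions 22 channels, e.g., eyebrow height, shoulder swing) to a gain
    applied to motion features. The mapping itself is an assumption."""

    def __init__(self, num_channels: int = 22, feat_dim: int = 1024):
        super().__init__()
        self.proj = nn.Linear(num_channels, feat_dim)

    def forward(self, motion_feats: torch.Tensor, intensity: torch.Tensor) -> torch.Tensor:
        # intensity: (B, 22) in [0, 1]; broadcast a per-feature gain over time.
        gain = torch.sigmoid(self.proj(intensity)).unsqueeze(1)
        return motion_feats * gain
```

A dedicated cross-attention layer is parameter-light because only the new query/key/value projections are added on top of the frozen backbone, which is consistent with the roughly 3% parameter overhead reported above.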
Breakthrough in generation effects
Supports 9 generation modes spanning close-up/half-body/full-body framing, front/side views, and dynamic backgrounds
Covers multiple styles including real person, cartoon, and animal, with lip-sync error under 40 ms (a rough way to estimate such an offset is sketched after this list)
360° surround-view generation with realistic details such as fluttering hair and neck wrinkles
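For context on the sub-40 ms claim, here is one rough, assumption-laden way to estimate audio-visual offset: cross-correlate per-frame audio energy with a mouth-openness signal extracted from the video. The function name and both input signals are hypothetical; the article does not describe the project's actual evaluation protocol (which is likely SyncNet-style).

```python
import numpy as np

def estimate_lipsync_offset_ms(audio_energy: np.ndarray,
                               mouth_openness: np.ndarray,
                               fps: float = 25.0,
                               max_lag_frames: int = 10) -> float:
    """Hypothetical sketch: find the lag (in frames) that best aligns
    per-frame audio energy with mouth openness, then convert to ms.
    Both signals are assumed to be sampled at the video frame rate."""
    a = (audio_energy - audio_energy.mean()) / (audio_energy.std() + 1e-8)
    m = (mouth_openness - mouth_openness.mean()) / (mouth_openness.std() + 1e-8)
    lags = list(range(-max_lag_frames, max_lag_frames + 1))
    scores = []
    for lag in lags:
        if lag >= 0:
            s = float(np.dot(a[lag:], m[:len(m) - lag]))
        else:
            s = float(np.dot(a[:lag], m[-lag:]))
        scores.append(s)
    best_lag = lags[int(np.argmax(scores))]
    # Note: at 25 fps this only resolves offsets in 40 ms steps; verifying
    # a sub-40 ms claim would need finer temporal resolution.
    return best_lag * 1000.0 / fps
```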
Performance comparison advantages
In a benchmark comparison against OmniHuman-1, it leads in motion continuity (CIDEr ↑18%) and identity preservation (SSIM ↑23%); an SSIM-based identity check is sketched below
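To ground the SSIM figure, here is a small example of how an SSIM-based identity-preservation score might be computed with scikit-image. The function name, the face-crop inputs, and the frame-sampling choices are assumptions, since the article does not specify the evaluation setup.

```python
import numpy as np
from skimage.metrics import structural_similarity

def identity_ssim(reference_face: np.ndarray,
                  generated_faces: list[np.ndarray]) -> float:
    """Hypothetical sketch: score identity preservation as the mean SSIM
    between a reference face crop and face crops from generated frames.
    Assumes uint8 RGB arrays of identical shape (H, W, 3)."""
    scores = [
        structural_similarity(reference_face, frame,
                              channel_axis=-1, data_range=255)
        for frame in generated_faces
    ]
    return float(np.mean(scores))
```

Higher mean SSIM across sampled frames would indicate the generated face stays closer to the reference photo over the course of the video.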



Model download link: https://pan.quark.cn/s/184684a6d030