Alibaba-PAI
We explore the Reward Backpropagation technique [1][2] to optimize the videos generated by Wan2.1-Fun for better alignment with human preferences. We provide the following pre-trained models (i.e. LoRAs) along with the training script. You can use these LoRAs to enhance the corresponding base model as a plug-in, or train your own reward LoRA.
Wan2.1-Fun-14B-InP-HPS2.1.safetensors
Official HPS v2.1 reward LoRA (rank=128 and network_alpha=64) for Wan2.1-Fun-14B-InP. It is trained with a batch size of 32 for 3,000 steps.
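As a rough sketch of how the reported hyperparameters enter inference: a LoRA with rank r and network_alpha a contributes a low-rank update scaled by a / r, so rank=128 with network_alpha=64 gives an effective scale of 0.5. The snippet below illustrates the standard LoRA merge formula with toy NumPy matrices; the dimensions and variable names are illustrative only, not taken from this release's checkpoint.

```python
import numpy as np

# Toy dimensions; the released LoRA uses rank=128 with network_alpha=64.
d_out, d_in, rank, alpha = 8, 8, 4, 2.0

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen base weight
A = rng.standard_normal((rank, d_in)) * 0.01  # LoRA "down" projection
B = np.zeros((d_out, rank))                   # LoRA "up" projection (zero-init)

# Standard LoRA merge: W' = W + (alpha / rank) * (B @ A).
# For this release: alpha / rank = 64 / 128 = 0.5.
scale = alpha / rank
W_merged = W + scale * (B @ A)

# With B zero-initialized, the update is zero and W' equals W.
print(np.allclose(W_merged, W))  # True
```

In practice a loader (e.g. a ComfyUI LoRA node for the converted files above) applies this same scaled update to each targeted layer, optionally multiplied by a user-chosen LoRA strength.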
Converted by Kijai:
https://huggingface.co/Kijai/Wan2.1-Fun-Reward-LoRAs-comfy
Converted to ComfyUI compatible format from:
https://huggingface.co/alibaba-pai/Wan2.1-Fun-Reward-LoRAs/tree/main