umt5-xxl-encoder-Q5_K_S.gguf

UMT5-XXL Text Encoder (Q5_K_S GGUF) for WAN 2.2

This is a GGUF-quantized version of the encoder from Google's UMT5-XXL (a multilingual T5 variant trained with UniMax language sampling), packaged for use with WAN 2.2 video and image generation models in ComfyUI.

What This Model Does: This text encoder converts your text prompts into numerical embeddings that guide the WAN 2.2 diffusion models during image and video generation. It's an essential component that helps the AI understand what you want to create.

Key Features:

Q5_K_S Quantization: 5-bit compression preserves most of the encoder's quality while keeping the file size manageable (4.05 GB vs 11.4 GB at full precision)
Memory Efficient: Suited to systems with 16GB+ VRAM
High Quality: Q5_K_S is the recommended quantization level for most users, balancing quality against resource usage
Universal Compatibility: Works with all WAN 2.2 model variants (Text-to-Video, Image-to-Video, Text-to-Image)

Technical Specifications:

Model Size: 4.05 GB
Quantization: Q5_K_S (5-bit)
Format: GGUF
Base Model: Google UMT5-XXL
Parameters: 5.68B
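
The size and parameter figures above imply an average storage cost per weight, which is a quick way to sanity-check the "5-bit" label. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted in this listing; it assumes decimal gigabytes (1 GB = 1e9 bytes), and the names `PARAMS`, `BYTES_PER_GB`, and `bits_per_weight` are illustrative.

```python
# Figures quoted in the specifications above.
PARAMS = 5.68e9        # parameter count
Q5_K_S_GB = 4.05       # quantized file size
FULL_GB = 11.4         # full-precision file size

BYTES_PER_GB = 1e9     # assumption: decimal gigabytes


def bits_per_weight(size_gb: float, params: float) -> float:
    """Average number of bits stored per parameter for a file of this size."""
    return size_gb * BYTES_PER_GB * 8 / params


q5_bpw = bits_per_weight(Q5_K_S_GB, PARAMS)
full_bpw = bits_per_weight(FULL_GB, PARAMS)

print(f"Q5_K_S: {q5_bpw:.2f} bits per weight")            # Q5_K_S: 5.70 bits per weight
print(f"Full precision: {full_bpw:.2f} bits per weight")  # Full precision: 16.06 bits per weight
print(f"Size reduction: {FULL_GB / Q5_K_S_GB:.2f}x")      # Size reduction: 2.81x
```

The quantized figure lands somewhat above 5 bits, presumably because the file also carries per-block scaling data and metadata alongside the 5-bit weights. The full-precision figure is consistent with 16-bit storage.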

Requirements:

ComfyUI with ComfyUI-GGUF extension
Place in: ComfyUI/models/clip/ directory
Use with: CLIPLoaderGGUF node
VRAM: Works well on 16GB+ systems
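
Before opening ComfyUI, it can help to confirm that the file is where the CLIPLoaderGGUF node will look for it. The following is a minimal sketch, not part of ComfyUI or ComfyUI-GGUF: `check_encoder` is a hypothetical helper, and it assumes the default folder layout listed above. It also checks that the file starts with the four-byte `GGUF` magic that opens every GGUF file, which catches truncated or mislabeled downloads.

```python
from pathlib import Path

MODEL_NAME = "umt5-xxl-encoder-Q5_K_S.gguf"
GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four bytes


def check_encoder(comfy_root: Path) -> Path:
    """Return the encoder path if it sits in models/clip and looks like GGUF.

    comfy_root is the ComfyUI installation folder (an assumption about
    the local setup; adjust it if custom model paths are configured).
    """
    path = comfy_root / "models" / "clip" / MODEL_NAME
    if not path.is_file():
        raise FileNotFoundError(f"Expected the encoder at {path}")
    with path.open("rb") as f:
        if f.read(4) != GGUF_MAGIC:
            raise ValueError(f"{path} does not start with the GGUF magic bytes")
    return path
```

Calling `check_encoder(Path("ComfyUI"))` from the directory that contains the ComfyUI folder either returns the resolved path or raises an error describing what is wrong.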

Recommended Use: This encoder is a good default for most users running WAN 2.2 workflows. If VRAM is limited and you need a smaller file, consider Q3_K_S (2.86 GB). For maximum quality with ample VRAM (24GB+), use Q8_0 (6.04 GB).
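
The recommendation above can be written as a small lookup. This is only a sketch of the guidance in this listing: `pick_quantization` and `TIERS` are illustrative names, and treating anything below 16 GB as "limited VRAM" is an assumption, since the listing does not give an exact cutoff.

```python
# (minimum VRAM in GB, quantization, file size in GB), highest tier first.
# The 16 GB boundary for "limited VRAM" is an assumption.
TIERS = [
    (24, "Q8_0", 6.04),
    (16, "Q5_K_S", 4.05),
    (0, "Q3_K_S", 2.86),
]


def pick_quantization(vram_gb: float) -> tuple[str, float]:
    """Return (quantization name, file size in GB) suggested for this much VRAM."""
    for min_vram, quant, size_gb in TIERS:
        if vram_gb >= min_vram:
            return quant, size_gb
    raise ValueError("vram_gb must be non-negative")


print(pick_quantization(12))  # ('Q3_K_S', 2.86)
print(pick_quantization(16))  # ('Q5_K_S', 4.05)
print(pick_quantization(24))  # ('Q8_0', 6.04)
```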

Credits:

Original Model: Google (UMT5-XXL)
GGUF Conversion: city96
Compatible with: WAN 2.2 (Wan-Video Team)

This encoder is a required component of WAN 2.2 video generation workflows, where it provides the text encoding for prompt-to-video generation.

This model was transferred from an external source (transfer address: https://huggingface.co/city96/umt5-xxl-encoder-gguf). If the original author objects to this transfer, they can file an appeal. We will edit, delete, or transfer the model to the original author within 24 hours, according to the original author's request.

Zachary Garner

Model Information

Frozen
Original author: city96
Model Type: GGUF
Base Model: WAN2.2
Trigger Words: Q5
Resource Name: models/unet_gguf/umt5-xxl-encoder-Q5_K_S.gguf
MD5: 513c1ba392420453cd28fbd6f6b1c7db
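
The MD5 value above can be used to confirm that a downloaded copy is intact. The sketch below uses only Python's standard `hashlib`; it assumes the listed checksum covers the complete file as distributed, and `md5_of` and `matches_listing` are illustrative helper names. The file is read in chunks so a multi-gigabyte model does not have to fit in memory.

```python
import hashlib
from pathlib import Path

# Checksum copied from the Model Information above.
EXPECTED_MD5 = "513c1ba392420453cd28fbd6f6b1c7db"


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: Path) -> bool:
    """True if the file's MD5 equals the value published in this listing."""
    return md5_of(path) == EXPECTED_MD5
```

For example, `matches_listing(Path("ComfyUI/models/clip/umt5-xxl-encoder-Q5_K_S.gguf"))` returns `False` for a corrupted or partial download.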
