Z-Image (造相) is a powerful and highly efficient image generation model with 6B parameters, developed by Alibaba's Tongyi Lab. It uses a Scalable Single-Stream DiT (S3-DiT) architecture in which text tokens, visual semantic tokens, and image VAE tokens are concatenated at the sequence level into a single unified input stream, maximizing parameter efficiency.
Model Variants:
  • 🚀 Z-Image-Turbo – A distilled version that matches or exceeds leading competitors with only 8 NFEs (Number of Function Evaluations). It offers sub-second inference latency on enterprise-grade H800 GPUs and fits on consumer devices with 16GB of VRAM.
  • 🧱 Z-Image-Base – The non-distilled foundation model, intended for community-driven fine-tuning and custom development.
  • ✍️ Z-Image-Edit – A variant fine-tuned for image editing tasks, with impressive instruction-following capabilities.
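The single-stream input described above can be pictured with a toy sketch. It is only an illustration of concatenating along the sequence axis, not Z-Image's actual code: the function name `build_single_stream`, the embedding width, and the token counts are all made up for this example.

```python
from typing import List

Token = List[float]  # one embedding vector; all tokens share the same width


def build_single_stream(text: List[Token],
                        semantic: List[Token],
                        vae: List[Token]) -> List[Token]:
    """Concatenate three token sequences along the sequence axis.

    Hypothetical helper: the order (text, semantic, VAE) follows the order
    in which the description lists the token types.
    """
    stream = text + semantic + vae
    widths = {len(tok) for tok in stream}
    if len(widths) > 1:
        raise ValueError(f"tokens must share one embedding width, got {sorted(widths)}")
    return stream


d_model = 4  # toy embedding width, chosen only for illustration
text_tokens = [[0.1] * d_model for _ in range(3)]      # 3 text tokens
semantic_tokens = [[0.2] * d_model for _ in range(2)]  # 2 visual semantic tokens
vae_tokens = [[0.3] * d_model for _ in range(5)]       # 5 image VAE tokens

stream = build_single_stream(text_tokens, semantic_tokens, vae_tokens)
print(len(stream), len(stream[0]))  # prints "10 4": one sequence of 10 tokens
```

Because every token lands in the same sequence, one stack of transformer blocks can process all three modalities, which is the parameter-efficiency argument made above.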
Model Highlights:
  • Photorealistic Quality: Delivers strong photorealistic image generation while maintaining excellent aesthetic quality
  • Accurate Bilingual Text Rendering: Excels at accurately rendering complex Chinese and English text
  • Prompt Enhancing & Reasoning: Prompt Enhancer empowers the model with reasoning capabilities
  • Sub-second Inference: Achieves fast generation speed on supported hardware
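To make the "8 NFEs" figure quoted for Z-Image-Turbo concrete, here is a minimal sketch of how function evaluations are counted during sampling. It assumes a plain Euler integrator over a toy velocity field; the sampler Z-Image actually uses is not described here, and `euler_sample` and `toy_velocity` are hypothetical names.

```python
def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps.

    Returns the final state and the number of function evaluations (NFE),
    i.e. how many times `velocity` (a stand-in for the model) was called.
    """
    x = x0
    dt = 1.0 / num_steps
    nfe = 0
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity(x, t)  # one model call per step
        nfe += 1
    return x, nfe


def toy_velocity(x, t):
    # Toy stand-in for a neural network: simple exponential decay.
    return -x


x_final, nfe = euler_sample(toy_velocity, x0=1.0, num_steps=8)
print(nfe)  # prints 8
```

With one model call per step, an 8-step schedule costs 8 forward passes, so latency scales roughly with the NFE count.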

Z-Image-Turbo text-to-image workflow


Make sure your ComfyUI is up to date. The workflows in this guide are available in the Workflow Templates. If you can't find them there, your ComfyUI may be outdated (updates for the Desktop version are released with some delay).
If nodes are missing when loading a workflow, possible reasons are:
  1. You are not using the latest ComfyUI version (Nightly version)
  2. Some nodes failed to import at startup
Model Storage Location
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │      └── qwen_3_4b.safetensors
│   ├── 📂 diffusion_models/
│   │      └── z_image_turbo_bf16.safetensors
│   └── 📂 vae/
│          └── ae.safetensors
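
A quick way to confirm the files are where the tree above expects them is a small check script. This is a sketch, not part of ComfyUI: `find_missing_models` is a hypothetical helper, and the root path `ComfyUI` is an assumption you should replace with your actual install directory.

```python
from pathlib import Path

# Relative paths taken from the storage layout above.
EXPECTED_FILES = [
    "models/text_encoders/qwen_3_4b.safetensors",
    "models/diffusion_models/z_image_turbo_bf16.safetensors",
    "models/vae/ae.safetensors",
]


def find_missing_models(comfyui_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not files under comfyui_root."""
    root = Path(comfyui_root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; adjust as needed.
    missing = find_missing_models("ComfyUI")
    if missing:
        print("Missing model files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected model files are in place.")
```

For the ControlNet workflow, the same helper can be called with an extended list that also includes the `model_patches` file.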

Z-Image-Turbo Fun Union ControlNet workflow

This workflow uses the Z-Image-Turbo Fun Union ControlNet model to generate images with ControlNet guidance. It applies Canny edge detection to a reference image and uses the ControlNet to guide the generation process.
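
To give a feel for the kind of edge map that conditions the ControlNet, the sketch below computes a thresholded Sobel gradient on a tiny grayscale image. This is a simplified stand-in, not full Canny: it omits Gaussian smoothing, non-maximum suppression, and hysteresis thresholding, and it is not the code the workflow runs. The function name and threshold value are illustrative assumptions.

```python
def sobel_edge_map(image, threshold):
    """Return a binary edge map (0 or 255) from a 2D list of gray values.

    Simplified stand-in for Canny: Sobel gradient magnitude plus a single
    threshold. Border pixels are left at 0.
    """
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# Toy 6x6 image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edges = sobel_edge_map(image, threshold=100)
for row in edges:
    print(row)
```

The output marks the vertical boundary between the dark and bright halves; an edge map like this, rather than the original pixels, is what guides the layout of the generated image.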


Additional model for ControlNet

Model Storage Location
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │      └── qwen_3_4b.safetensors
│   ├── 📂 diffusion_models/
│   │      └── z_image_turbo_bf16.safetensors
│   ├── 📂 vae/
│   │      └── ae.safetensors
│   └── 📂 model_patches/
│          └── Z-Image-Turbo-Fun-Controlnet-Union.safetensors