Z-Image Base + ControlNet Workflow

Updated: May 9, 2026

Type: Workflows
Base Model: ZImageTurbo
Tag: character
Published: May 9, 2026
File: Archive, 3.76 KB (1 variant available)
Hash (AutoV2): 080C8C783D
Creator: AIKSK

This ComfyUI workflow is designed for Z-Image Base generation with ControlNet structure guidance. Its main purpose is to let creators use a reference image as a structural control source, extract pose or composition information through a preprocessor, and then guide Z-Image Base to generate a new image that follows the reference layout while still being driven by a detailed text prompt.

The workflow is built around Z-Image Base, using z_image_bf16.safetensors as the main diffusion model, qwen_3_4b.safetensors as the Qwen Image text encoder, and ae.safetensors as the VAE. The key control module is Z-Image-Fun-Controlnet-Union-2.1.safetensors, loaded through ModelPatchLoader and applied through the ZImageFunControlnet node. This gives the workflow a ControlNet-style generation route for Z-Image Base.
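
If you drive ComfyUI through its API (JSON) format, the loader section of this graph might look roughly like the sketch below. The node class name ModelPatchLoader comes from the workflow itself; the choice of UNETLoader, CLIPLoader, and VAELoader for the three base files, and all input field names, are assumptions to verify against the actual graph.

```python
# Sketch of the loader nodes in ComfyUI API (JSON) format.
# Links elsewhere reference these as ["node_id", output_index].
loaders = {
    "1": {"class_type": "UNETLoader",        # assumed loader for the diffusion model
          "inputs": {"unet_name": "z_image_bf16.safetensors",
                     "weight_dtype": "default"}},
    "2": {"class_type": "CLIPLoader",        # Qwen Image text encoder
          "inputs": {"clip_name": "qwen_3_4b.safetensors",
                     "type": "qwen_image"}},  # encoder type string is an assumption
    "3": {"class_type": "VAELoader",
          "inputs": {"vae_name": "ae.safetensors"}},
    "4": {"class_type": "ModelPatchLoader",  # ControlNet-Union model patch
          "inputs": {"name": "Z-Image-Fun-Controlnet-Union-2.1.safetensors"}},
}
```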

Unlike a normal text-to-image workflow, this graph does not rely only on prompt descriptions. It uses a control image extracted from the input reference. The input image is loaded into ComfyUI, then processed through AIO_Preprocessor with DWPreprocessor. This creates a pose or structure guidance image that can be passed into the ControlNet module. The ControlNet then helps the final generation follow the body pose, framing, angle, or major composition from the reference image.
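
The reference-to-control-image branch can be sketched the same way. AIO_Preprocessor comes from the comfyui_controlnet_aux node pack; its input names below (preprocessor, resolution) are assumptions, as is the reference filename.

```python
# Reference image -> pose/structure control image.
control_prep = {
    "10": {"class_type": "LoadImage",
           "inputs": {"image": "reference.png"}},        # hypothetical reference file
    "11": {"class_type": "AIO_Preprocessor",
           "inputs": {"image": ["10", 0],
                      "preprocessor": "DWPreprocessor",  # DWPose-based extraction
                      "resolution": 1280}},              # assumed; match the target canvas
}
```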

This is useful when text prompts alone are not enough. For example, if you want a character to hold a specific pose, follow a diagonal action composition, match a camera angle, or preserve the structure of a dynamic scene, pure prompting can be unstable. Z-Image Base may create a good image, but the pose or layout may drift. With ControlNet guidance, the workflow gives the model a stronger spatial reference.

The workflow uses ZImageFunControlnet as the main bridge between the Z-Image model, the ControlNet patch, the VAE, and the preprocessed control image. The strength value controls how strongly the control image affects the final output. In the included setup, the control strength is set to 1, meaning the workflow is configured for strong structural influence. Users can reduce the strength if they want more creative freedom, or keep it high when pose and composition accuracy matter more.
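
As a hedged sketch, the bridge node might be wired as below. The socket names for ZImageFunControlnet are assumptions based on the description above; only the strength value of 1 is confirmed by the included setup.

```python
# ZImageFunControlnet: bridges model, ControlNet patch, VAE, and control image.
controlnet = {
    "20": {"class_type": "ZImageFunControlnet",
           "inputs": {"model": ["1", 0],        # Z-Image Base diffusion model
                      "model_patch": ["4", 0],  # ControlNet-Union patch
                      "vae": ["3", 0],
                      "image": ["11", 0],       # preprocessed control image
                      "strength": 1.0}},        # lower for more creative freedom
}
```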

The text conditioning route uses CLIPTextEncode with the Qwen Image text encoder. The positive prompt can describe the subject, style, camera angle, lighting, action, material detail, atmosphere, and final image quality. The uploaded example uses a highly detailed cinematic action prompt with combat composition, dramatic lighting, fog, sparks, motion blur, and hyperreal material texture. This shows that the workflow is suitable for complex scene generation, not just simple character posing.

The negative prompt is designed to suppress common visual problems such as bad lighting, dark or gloomy output, overexposure, underexposure, low contrast, grayscale, monochrome, draft-like rendering, sketch effects, crayon texture, comic output, or cartoon-like results when the target is realism. Users can adjust this negative prompt depending on whether they want realistic, anime, concept art, or stylized output.
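
Both conditioning routes go through the same CLIPTextEncode node with the Qwen encoder. The prompt strings in the sketch below are placeholder examples in the spirit of the uploaded prompts, not the originals.

```python
# Positive and negative conditioning through the Qwen Image text encoder.
conditioning = {
    "30": {"class_type": "CLIPTextEncode",  # positive prompt
           "inputs": {"clip": ["2", 0],
                      "text": "cinematic action scene, dramatic lighting, fog, "
                              "sparks, motion blur, hyperreal material texture"}},
    "31": {"class_type": "CLIPTextEncode",  # negative prompt (realism target)
           "inputs": {"clip": ["2", 0],
                      "text": "sketch, crayon, comic, cartoon, grayscale, monochrome, "
                              "low contrast, overexposed, underexposed"}},
}
```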

The workflow uses ModelSamplingAuraFlow with a shift value around 3. This prepares the Z-Image Base model for the intended sampling behavior. The generation canvas is created through EmptyLatentImage. In the included setup, the latent size is 720 x 1280, a vertical (portrait) format suitable for character images, social media covers, action poster concepts, and mobile-friendly visual outputs.
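
In sketch form, with the assumption that ModelSamplingAuraFlow receives the ControlNet-patched model (the actual node order may differ):

```python
# Sampling preparation: AuraFlow-style shift plus the empty latent canvas.
sampling_prep = {
    "40": {"class_type": "ModelSamplingAuraFlow",
           "inputs": {"model": ["20", 0],  # assumed: patched model from the ControlNet stage
                      "shift": 3.0}},
    "41": {"class_type": "EmptyLatentImage",
           "inputs": {"width": 720, "height": 1280, "batch_size": 1}},
}
```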

The KSampler stage uses around 30 steps, CFG 3, the Euler sampler, the simple scheduler, and full denoise (1.0). Since this is text-to-image generation with ControlNet guidance rather than img2img, there is no init image to preserve: the ControlNet image provides the structure, while the prompt defines the final visual content. The output is decoded through VAEDecode and saved with SaveImage.
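
The final stage, continuing the same hedged sketch with the settings just described:

```python
# Generation, decode, and save. KSampler input names are standard ComfyUI.
generate = {
    "50": {"class_type": "KSampler",
           "inputs": {"model": ["40", 0],
                      "positive": ["30", 0],
                      "negative": ["31", 0],
                      "latent_image": ["41", 0],
                      "seed": 42,               # change per run; see seed notes below
                      "steps": 30,
                      "cfg": 3.0,
                      "sampler_name": "euler",
                      "scheduler": "simple",
                      "denoise": 1.0}},         # full denoise: structure comes from ControlNet
    "51": {"class_type": "VAEDecode",
           "inputs": {"samples": ["50", 0], "vae": ["3", 0]}},
    "52": {"class_type": "SaveImage",
           "inputs": {"images": ["51", 0], "filename_prefix": "zimage_controlnet"}},
}
```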

This workflow is especially useful for creators who need more controllable Z-Image Base generation. It can be used for pose-guided character creation, action scene composition, cinematic poster generation, anime or realistic character posing, reference-based layout control, and creative concept art production. The workflow is also suitable for Civitai examples, RunningHub online demos, YouTube tutorials, Bilibili workflow showcases, and repeatable prompt experiments.

Main features:

- Z-Image Base + ControlNet workflow

- Uses z_image_bf16.safetensors

- Uses Qwen 3 4B text encoder

- Uses ae.safetensors VAE

- Uses Z-Image-Fun-Controlnet-Union-2.1 model patch

- ZImageFunControlnet control route

- DWPreprocessor structure / pose extraction

- Reference image to control image pipeline

- Vertical 720 x 1280 generation setup

- ModelSamplingAuraFlow support

- Prompt-based controlled generation

- Negative prompt for lighting and quality control

- KSampler generation with Euler sampler

- VAEDecode and SaveImage final output

- Suitable for pose control, composition control, and cinematic image generation

Recommended use cases:

Z-Image Base ControlNet testing, pose-guided image generation, reference composition transfer, action scene generation, character pose control, cinematic poster creation, AI cover image generation, fantasy or sci-fi scene layout control, anime character posing, realistic character rendering, concept art generation, social media visual production, Civitai showcase image creation, RunningHub workflow publishing, and prompt-control comparison testing.

Suggested workflow:

Start by preparing a clear reference image. The reference should contain the pose, silhouette, or composition you want the final image to follow. If the pose is unclear, occluded, or too small in the frame, the DWPreprocessor output may not provide enough useful structure.

Load the reference image into the workflow. The AIO_Preprocessor section will process it with DWPreprocessor and create a control image. Check the control result if possible. A good control image should clearly show the pose or structural layout. If the control image is weak, use a cleaner source image or adjust the preprocessor resolution.

Write a detailed positive prompt. Since the ControlNet handles structure, the prompt should focus on what the final image should become. Describe the subject, outfit, scene, lighting, atmosphere, material texture, camera angle, style, and quality level. For action scenes, describe direction of movement, lighting focus, and the main visual conflict clearly.

Use the negative prompt to control unwanted styles and artifacts. If you are aiming for realism, keep negative terms such as sketch, crayon, comic, cartoon, grayscale, low contrast, overexposed, and underexposed. If you are aiming for anime or illustration, adjust the negative prompt so it does not fight the intended style.

ControlNet strength is important. A higher strength makes the result follow the reference pose or structure more strongly. A lower strength gives the model more freedom to reinterpret the scene. For accurate pose control, keep strength high. For more creative composition, lower it slightly.
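
If you drive the workflow from a script, a strength sweep is a quick way to find the right balance. The sketch below assumes an API-format export of the graph and a ZImageFunControlnet node with a strength input, as described above; the node id "20" is hypothetical and should be read from your own export.

```python
import copy
import json
import urllib.request

# Load an API-format export of the workflow (Export (API) in ComfyUI).
with open("zimage_controlnet_api.json") as f:
    base_graph = json.load(f)

CONTROLNET_NODE = "20"  # hypothetical id; look it up in your own export

for strength in (1.0, 0.8, 0.6):
    graph = copy.deepcopy(base_graph)
    graph[CONTROLNET_NODE]["inputs"]["strength"] = strength
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",  # default local ComfyUI endpoint
        data=json.dumps({"prompt": graph}).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)
```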

Use the 720 x 1280 canvas for vertical poster or character images. If you need square or landscape output, adjust the EmptyLatentImage width and height. Keep model-friendly dimensions and avoid extreme aspect ratios during early testing.
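
When changing the canvas, keep dimensions divisible by the model's latent granularity. The helper below assumes a multiple of 16 pixels, which is typical for latent diffusion models but not confirmed for Z-Image Base.

```python
def snap_canvas(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Round a requested canvas to model-friendly dimensions.

    The multiple-of-16 assumption is common for latent models;
    verify it against Z-Image Base before relying on it.
    """
    snap = lambda v: max(multiple, round(v / multiple) * multiple)
    return snap(width), snap(height)

print(snap_canvas(720, 1280))  # (720, 1280) - the default canvas is already aligned
print(snap_canvas(1600, 900))  # (1600, 896) - a landscape variant gets snapped
```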

Run several seeds when testing. ControlNet can preserve structure, but different seeds still change detail, lighting, clothing, background, and style. If the structure is correct but the image quality is not ideal, keep the same prompt and test another seed. If the structure is wrong, check the control image and ControlNet strength first.
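
A seed sweep can be scripted the same way as the strength sweep above; again, the KSampler node id is hypothetical and should come from your own API export.

```python
import copy
import json
import random
import urllib.request

with open("zimage_controlnet_api.json") as f:
    base_graph = json.load(f)

KSAMPLER_NODE = "50"  # hypothetical id from the sketch above

for _ in range(4):  # four seeds, same prompt and same ControlNet strength
    graph = copy.deepcopy(base_graph)
    graph[KSAMPLER_NODE]["inputs"]["seed"] = random.randrange(2**32)
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps({"prompt": graph}).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)
```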

When evaluating the result, check whether the final image follows the reference pose, whether the prompt content is respected, whether the character anatomy is stable, and whether lighting and composition look coherent. A good result should combine the reference structure with the new visual identity described in the prompt.

This workflow is designed as a practical Z-Image Base ControlNet pipeline for ComfyUI users. It provides a direct route from reference image to controlled generation, making it useful for creators who want stronger pose control, better composition consistency, and more predictable Z-Image Base outputs.

🎥 YouTube Video Tutorial

Want to know what this workflow actually does and how to start fast?

This video explains what the tool is, how to launch the workflow instantly, and shares my core design logic — no local setup, no complicated environment.

Everything starts directly on RunningHub, so you can experience it in action first.

👉 YouTube Tutorial: https://youtu.be/mYpdxdHGlQM

Before you begin, I recommend watching the video thoroughly — getting the full context helps you understand the tool faster and avoid common detours.

⚙️ RunningHub Workflow

Try the workflow online right now — no installation required.

👉 Workflow: https://www.runninghub.ai/post/2022640984902864898/?inviteCode=rh-v1111

If the results meet your expectations, you can later deploy it locally for customization.

🎁 Fan Benefits: Register to get 1,000 points plus 100 points per daily login, and enjoy RTX 4090 performance with 48 GB of memory!

📺 Bilibili Updates (Mainland China & Asia-Pacific)

If you’re in the Asia-Pacific region, you can watch the video below to see the workflow demonstration and creative breakdown.

📺 Bilibili Video: https://www.bilibili.com/video/BV1quZ7BpEPE/

☕ Support Me on Ko-fi

If you find my content helpful and want to support future creations, you can buy me a coffee ☕.

Every bit of support helps me keep creating — just like a spark that can ignite a blazing flame.

👉 Ko-fi: https://ko-fi.com/aiksk

💼 Business Contact

For collaboration or inquiries, please contact aiksk95 on WeChat.


📦 Model Resources (Quark Drive)

I will keep updating model resources on Quark Drive (夸克网盘):

👉 https://pan.quark.cn/s/20c6f6f8d87b

These resources are mainly intended for local users, to support creation and learning.