NVIDIA PiD Flux1 — Smart 4× Detail Upscaler

Name: NVIDIA PiD Flux1 — Smart 4× Detail Upscaler
Rating: 5 (12 reviews)
Author: AIMotionStudio

Updated: Jun 4, 2026

base model

upscaler flux1 auto caption florence2 4x upscale

Download: NVIDIApid_flux1_upscaler.zip (archive, 4.75 KB, 1 variant available, verified 2 months ago)

Details

Type: Workflows
Stats: 617
Reviews: Positive (12)
Published: Jun 4, 2026
Base Model: Flux.1 S
Training: Steps: 8, Epochs: 10
Hash (AutoV2): 8206EE6537

About this version

AIMotionStudio

Joined Dec 6, 2023

License: Apache 2.0

A 1-click, 4-step upscaler built around NVIDIA's PiD (Pixel Diffusion) model on the Flux1 backbone. Feed it any image and it intelligently regenerates real, fine detail at 4× resolution — then supersamples it back down to a razor-sharp final image. An automatic captioner keeps the upscale faithful to your image's content, so you get crisp detail without hallucinated junk.

✨ What makes this workflow different

Works with ANY aspect ratio: 16:9, 9:16, square, anything. The input is auto-normalized to the model's native 1024px long edge (e.g. a 1280×720 image becomes 1024×576). Feeding PiD an input at any other size is the #1 cause of broken/green-tinted results, so the workflow handles this for you.
Auto-prompting via Florence-2 — a vision model reads your image and writes a detailed caption automatically, guiding the upscaler to preserve the actual content. No manual prompting needed.
Locked to the model's true 4× regime — PiD 1024→4096 is a fixed 4× model; this workflow targets exactly that for maximum sharpness (no blur from under/over-scaling).
Supersampled final output: the 4096px result is Lanczos-downscaled back to 1024px, folding the generated 4× detail into a clean, antialiased image. The full 4096px version is saved as well.
Side-by-side comparison — a built-in slider compares your original against the result.
Fast — only 4 sampling steps (LCM).
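
For anyone who wants to predict the sizes before queueing, here is a minimal sketch of the arithmetic described above: fit the long edge to 1024px, multiply by 4 for the PiD pass, then return to the 1024px size for the supersampled output. The function names and the simple rounding rule are assumptions for illustration; the workflow's own nodes may round differently.

```python
def fit_long_edge(width, height, long_edge=1024):
    """Scale (width, height) so the longer side equals long_edge, keeping aspect ratio.

    Hypothetical helper: rounding to the nearest pixel is an assumption.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    new_w = max(1, round(width * long_edge / longest))
    new_h = max(1, round(height * long_edge / longest))
    return new_w, new_h


def plan_sizes(width, height, long_edge=1024, factor=4):
    """Return the three sizes the description mentions for a given input image."""
    model_input = fit_long_edge(width, height, long_edge)
    upscaled = (model_input[0] * factor, model_input[1] * factor)
    return {
        "model_input": model_input,   # what the PiD model is fed
        "upscaled": upscaled,         # full 4x output (first Save node)
        "supersampled": model_input,  # downscaled back to the input size
    }


if __name__ == "__main__":
    for name, size in plan_sizes(1280, 720).items():
        print(f"{name}: {size[0]}x{size[1]}")
```

For a 1280×720 input this gives 1024×576 for the model input and 4096×2304 for the full 4× image, matching the examples on this page.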

🔧 How to use

Install the 3 model files (links + folder layout below).
Load the workflow, drop your image into the Load Image node.
Hit Queue. That's it.

Outputs:

Full 4× image (e.g. 4096×2304) — saved via the first Save node.
Supersampled 1024 image — crisp, detail-packed, saved via the second Save node.

Tip: The PiD model is brightest/cleanest on near-square and landscape images. Extreme portrait crops can show mild color shifts — that's a known characteristic of the current distilled model, not the workflow.

📥 Download links:

Gemma 2 2B text encoder: https://huggingface.co/Comfy-Org/PixelDiT/tree/main/text_encoders
PiD models: https://huggingface.co/Comfy-Org/PixelDiT/tree/main/diffusion_models
VAE: https://huggingface.co/Comfy-Org/z_image_turbo/tree/main/split_files/vae

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │   └── gemma_2_2b_it_elm_bf16.safetensors
│   ├── 📂 vae/
│   │   └── ae.safetensors
│   └── 📂 diffusion_models/
│       └── pid_flux1_1024_to_4096_4step_bf16.safetensors
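
If the workflow loads with missing-model errors, a quick check like the one below can confirm the three files are where the layout above expects them. This is only a sketch: the function name is hypothetical, and it assumes your install uses the default models/ folder rather than a custom path configured elsewhere.

```python
from pathlib import Path

# Relative paths taken from the folder layout on this page.
EXPECTED_MODEL_FILES = (
    "models/text_encoders/gemma_2_2b_it_elm_bf16.safetensors",
    "models/vae/ae.safetensors",
    "models/diffusion_models/pid_flux1_1024_to_4096_4step_bf16.safetensors",
)


def missing_model_files(comfy_root):
    """Return the expected model paths that are not present under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in EXPECTED_MODEL_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    import sys

    # Pass your ComfyUI folder as the first argument; "ComfyUI" is just a default guess.
    root = sys.argv[1] if len(sys.argv) > 1 else "ComfyUI"
    missing = missing_model_files(root)
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All 3 model files found.")
```

Run it with the path to your ComfyUI folder as the only argument.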

📋 Required custom nodes

ComfyUI-Florence2 (auto-captioning)
ComfyUI-Custom-Scripts (pythongosssss — ShowText / MathExpression)
ComfyUI-easy-use
rgthree-comfy (Image Comparer)
ComfyUI_Swwan (GetImageSizeAndCount)
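
Red "missing node" boxes usually mean one of the packs above is not installed. The sketch below looks for them in custom_nodes/. It assumes each pack sits in a folder named like the pack itself (as after a plain git clone) and compares names case-insensitively; a pack installed under a different folder name would be reported as missing even if it works. The function name is hypothetical.

```python
from pathlib import Path

# Pack names as listed on this page; folder names on disk are assumed to match.
REQUIRED_NODE_PACKS = (
    "ComfyUI-Florence2",
    "ComfyUI-Custom-Scripts",
    "ComfyUI-easy-use",
    "rgthree-comfy",
    "ComfyUI_Swwan",
)


def missing_node_packs(comfy_root):
    """Return required pack names with no matching folder in custom_nodes/."""
    custom_nodes = Path(comfy_root) / "custom_nodes"
    installed = set()
    if custom_nodes.is_dir():
        # Lowercase folder names so "ComfyUI-Easy-Use" matches "ComfyUI-easy-use".
        installed = {p.name.lower() for p in custom_nodes.iterdir() if p.is_dir()}
    return [name for name in REQUIRED_NODE_PACKS if name.lower() not in installed]


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "ComfyUI"
    missing = missing_node_packs(root)
    print("Missing packs: " + (", ".join(missing) if missing else "none"))
```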

Requires a recent ComfyUI build with PixelDiT / PiD support. ~16GB VRAM recommended for the full 4096px pass.

https://www.youtube.com/@AiMotionStudio