HDR VAE (Anima - QWEN Image)

Updated: Jun 21, 2026

Tags: style, hdr, fp32

Format: SafeTensor

Size: 242.05 MB

Type: VAE

Published: Jun 21, 2026

Base Model: Qwen

Hash (AutoV2): D51D07F007
Felldude

License: Apache 2.0

Qwen Image VAE

  • Full FP32 Training of Decoder

  • Works in ComfyUI

Feel free to suggest on-site support to Civitai staff. I don't think they have an agreement in place like the one with FLUX.


Overview

This model is a fine-tuned variant of the base Qwen Image VAE, modified to emphasize high-frequency detail preservation and expanded color representation, following an HDR-style reconstruction objective.

The evaluation compares the base and HDR-tuned models using perceptual, structural, distributional, and photometric metrics over identical input data.


Evaluation Summary

Perceptual Fidelity (LPIPS)

  • Base: 0.0177

  • HDR: 0.0786

The HDR model's LPIPS is roughly 4.4 times the base model's (0.0786 vs. 0.0177). Under deep-feature similarity its reconstructions are therefore less faithful to the input, which is consistent with a shift from strict identity reconstruction toward detail-enhancing reconstruction.


Structural Energy (Gradient Magnitude)

  • Ground Truth: 404.02 (both models)

  • Base Reconstruction: 313.46

  • HDR Reconstruction: 687.97

The base model shows strong low-pass behavior: its reconstructions retain only about 78% of the ground-truth gradient energy. In contrast, the HDR model amplifies high-frequency content, reaching about 170% of the ground-truth energy and so exceeding the structural energy of the original inputs.
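
As a rough illustration of what a gradient-magnitude energy measures, here is a minimal sketch. The exact definition used for the numbers above is not stated, so this assumes a sum of forward-difference gradient magnitudes over a grayscale image. `gradient_energy` and `scale_detail` are hypothetical helper names, and the striped test image is a toy input, not evaluation data.

```python
import math

def gradient_energy(img):
    """Sum of forward-difference gradient magnitudes (assumed definition).

    `img` is a 2D grayscale image given as a list of rows of floats.
    """
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]   # horizontal difference
            gy = img[y + 1][x] - img[y][x]   # vertical difference
            total += math.hypot(gx, gy)
    return total

def scale_detail(img, k, mean=0.5):
    """Toy stand-in for a decoder: scale deviations from `mean` by k.

    k < 1 mimics smoothing, k > 1 mimics detail amplification.
    """
    return [[mean + k * (v - mean) for v in row] for row in img]

# Vertical stripes as a simple high-frequency texture.
stripes = [[0.0, 1.0, 0.0, 1.0] for _ in range(4)]

e_gt = gradient_energy(stripes)
e_smooth = gradient_energy(scale_detail(stripes, 0.5))
e_sharp = gradient_energy(scale_detail(stripes, 1.5))

# Energy relative to the original: below 1 means detail was lost,
# above 1 means detail was amplified. Here the ratios are 0.5 and 1.5.
print(e_smooth / e_gt, e_sharp / e_gt)
```

Read this way, the reported values correspond to ratios of about 0.78 for the base reconstruction (313.46 / 404.02) and about 1.70 for the HDR reconstruction (687.97 / 404.02).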


Color Distribution Support

  • Ground Truth: 33150.61 (both models)

  • Base Reconstruction: 35004.49

  • HDR Reconstruction: 40133.37

Both reconstructions exceed the ground-truth color support, the base model by about 6% and the HDR model by about 21%. The HDR model's substantially larger support indicates increased chromatic dispersion and reduced quantization collapse.
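
The page does not define how color support is computed. One plausible reading, shown here purely as an assumption, is the number of distinct quantized RGB colors an image occupies, averaged over the test images. `color_support` and its `levels` parameter are hypothetical, and the pixel lists are toy inputs.

```python
def color_support(pixels, levels=32):
    """Count distinct occupied RGB bins (assumed definition of 'support').

    `pixels` is an iterable of (r, g, b) floats in [0, 1]; each channel is
    quantized into `levels` bins before counting.
    """
    occupied = set()
    for rgb in pixels:
        key = tuple(max(0, min(int(c * levels), levels - 1)) for c in rgb)
        occupied.add(key)
    return len(occupied)

# A flat gray patch occupies one bin; a more varied patch occupies several.
flat = [(0.5, 0.5, 0.5)] * 4
varied = [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6), (0.7, 0.8, 0.9), (0.1, 0.2, 0.3)]

print(color_support(flat), color_support(varied))
```

Under this reading, a reconstruction with a larger count than the ground truth is spreading pixels across more color bins than the input contained.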


Photometric Stability

Brightness Bias

  • Base: 0.000351

  • HDR: 0.0000098

Contrast Gain

  • Base: 0.9984

  • HDR: 0.99999

Both models preserve global photometric consistency, with the HDR variant showing near-perfect affine stability.
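
Brightness bias and contrast gain can be read as the offset and slope of an affine fit between ground-truth and reconstructed pixel values. The sketch below assumes that interpretation and uses an ordinary least-squares fit; `affine_fit` is a hypothetical helper and the pixel values are toy data.

```python
def affine_fit(gt, recon):
    """Least-squares fit of recon ≈ gain * gt + bias.

    `gt` and `recon` are equal-length flat lists of pixel intensities.
    Returns (gain, bias). Assumed interpretation, not the author's code.
    """
    n = len(gt)
    mean_gt = sum(gt) / n
    mean_re = sum(recon) / n
    cov = sum((g - mean_gt) * (r - mean_re) for g, r in zip(gt, recon))
    var = sum((g - mean_gt) ** 2 for g in gt)
    gain = cov / var
    bias = mean_re - gain * mean_gt
    return gain, bias

gt = [0.0, 0.25, 0.5, 0.75, 1.0]
# Toy reconstruction with slightly reduced contrast and a small offset.
recon = [0.9 * g + 0.05 for g in gt]

gain, bias = affine_fit(gt, recon)
print(gain, bias)
```

A gain close to 1 and a bias close to 0, as reported for both models, mean that overall brightness and contrast pass through the VAE essentially unchanged.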


Channel Drift

  • Red Shift:

    • Base: +0.0116

    • HDR: +0.0104

  • Green Shift:

    • Base: -0.0606

    • HDR: -0.1856

  • Blue Shift:

    • Base: +0.0187

    • HDR: +0.0219

The HDR model's green shift is about three times the base model's (-0.1856 vs. -0.0606), a markedly stronger negative bias, while red and blue shifts remain comparable between the two models.
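
A per-channel drift of this kind can be sketched as the mean signed difference between reconstruction and ground truth in each RGB channel. This is an assumed definition; the value scale behind the reported shifts is not stated, and `channel_drift` is a hypothetical helper run on toy pixels.

```python
def channel_drift(gt, recon):
    """Mean signed (recon - gt) difference per RGB channel.

    `gt` and `recon` are equal-length lists of (r, g, b) tuples.
    Returns a (red, green, blue) tuple. Assumed definition.
    """
    n = len(gt)
    return tuple(
        sum(r[c] - g[c] for g, r in zip(gt, recon)) / n
        for c in range(3)
    )

gt_px = [(0.5, 0.5, 0.5), (0.2, 0.6, 0.4)]
# Toy reconstruction that loses a little green and leaves red/blue alone.
recon_px = [(0.5, 0.4, 0.5), (0.2, 0.5, 0.4)]

red, green, blue = channel_drift(gt_px, recon_px)
print(red, green, blue)
```

A negative green value, as in both models above, means reconstructions carry slightly less green on average than the inputs.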


Interpretation

The base Qwen VAE behaves as a contractive perceptual projection operator, prioritizing smooth reconstructions and suppression of high-frequency components.

The HDR-tuned variant transitions into a detail-amplifying reconstruction operator, characterized by:

  • Increased high-frequency energy

  • Expanded color manifold coverage

  • Higher perceptual divergence under LPIPS

  • Preserved global photometric invariance

This represents a functional shift from a smoothing autoencoder regime toward a high-frequency-preserving (HDR-like) reconstruction regime.