
LTX2.3 ReStyle IC-LoRA


Updated: May 6, 2026

Tags: style, ltx, restyle, ic-lora

Download: 1 variant available (SafeTensor, 768.1 MB)

Type: LoRA

Stats: 53 · 0 reviews

Published: May 6, 2026

Base Model: LTXV 2.3

Hash: AutoV2 22861DD2D7

Cseti

This is an early v0.1 release with known limitations. It transfers simpler styles (e.g. flat 2D / cel-shaded / monochrome line art), but struggles with more complex styles that involve texture, intricate detail, or strong material/lighting effects.

Quality often improves noticeably if you:

  • raise CFG to ~1.1 – 2.0 (CFG=1 is the distilled-model default)

  • use a non-distilled LTX-2.3 model

If I have enough compute, I'll keep working on a better version. Until then, enjoy and happy creating!

An IC-LoRA (in-context LoRA) for LTX-Video 2.3 (22B) trained for image-guided style transfer: given a source video and a single reference image describing the target style, the model re-renders the video in that style while preserving the original content and motion.

Training details

This IC-LoRA was trained on RunPod cloud GPUs.

  • Base model: Lightricks/LTX-2.3 (22B)

  • Training framework: ltx-trainer (Lightricks)

  • Training strategy: video-to-video IC-LoRA (first_frame_conditioning_p: 0.0, reference latents stream carries style)

  • Released checkpoint: step 8,000

  • LoRA rank / alpha: 128 / 128

  • Target modules: attn1.{to_k,to_q,to_v,to_out.0} + attn2.{to_k,to_q,to_v,to_out.0} (self + cross attention)

  • Optimizer: Prodigy

  • Scheduler: constant

  • Mixed precision: bf16

  • Batch size: 1 (gradient checkpointing on)

  • Timestep sampling: shifted_logit_normal

  • Resolution: trained at 768x448 @ 97 frames

  • Dataset: 562 cross-pair samples derived from the Ditto-1M style-transfer dataset (50 styles × ~11 pairs each). Each training reference is constructed by replacing frame 0 of the source video with the stylized first frame of a different pair from the same style
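The cross-pair construction described above can be sketched in Python. This is a minimal illustration, not the actual training code; the function name, data layout, and string frame stand-ins are all hypothetical, with frames standing in for decoded video tensors:

```python
import random

def build_cross_pairs(pairs_by_style):
    """Build cross-pair samples: the reference for a (source, stylized) pair
    reuses the source video but swaps frame 0 for the stylized first frame
    of a *different* pair from the same style, so the reference stream
    carries the style signal while the motion comes from the source."""
    samples = []
    for style, pairs in pairs_by_style.items():
        for i, (src_frames, stylized_frames) in enumerate(pairs):
            # Pick a different pair of the same style to donate its stylized frame 0.
            j = random.choice([k for k in range(len(pairs)) if k != i])
            donor_stylized_first = pairs[j][1][0]
            reference = [donor_stylized_first] + src_frames[1:]
            samples.append({
                "style": style,
                "reference": reference,       # style-bearing conditioning stream
                "target": stylized_frames,    # training target
                "caption": f"Make it {style} style.",
            })
    return samples
```

With 50 styles at roughly 11 pairs each, this yields the ~562 samples mentioned above, each paired with a caption in the training template.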

Inference

For inference I used ComfyUI. Workflow available here: Cseti/ComfyUI-Workflows — restyle-ic-lora.

Conditioning: both modalities are supported, and mixing them works best:

  • Image reference: a single still image in the requested style, fed as frame 0

  • Text prompt: e.g. Make it Disney 2D Animation style. / Make it watercolor style. — matches the training caption template (Make it {style} style.).

LoRA strength: 1.0.

Prompting tips

The style reference image carries the primary signal; the text prompt reinforces and disambiguates it. A few patterns that help:

  • Match the training caption template: Make it {style} style. — e.g. Make it watercolor style. or Make it Disney 2D Animation style. The shorter form is the safe default.

  • A more detailed style description can help: expanding the prompt with technique / medium / palette / lighting cues steers the model toward your intent.
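As a small sketch of the tips above, the training caption template can be filled programmatically; the helper name and its optional-details behavior are illustrative assumptions, not part of the released workflow:

```python
def make_caption(style: str, details: str = "") -> str:
    """Build a prompt matching the training caption template
    'Make it {style} style.', optionally followed by extra
    technique / medium / palette / lighting cues."""
    caption = f"Make it {style} style."
    if details:
        caption += " " + details
    return caption

# Safe default (short form):
print(make_caption("watercolor"))
# Expanded form with style cues:
print(make_caption("Disney 2D Animation",
                   "Flat cel shading, clean line art, soft pastel palette."))
```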

Important Notes

This LoRA was created as part of a research project. The training data is derived from the publicly released Ditto-1M dataset; please respect the licensing terms of the source dataset and any source video content. Users use the model at their own risk and must comply with applicable copyright laws.

Acknowledgement

Special thanks to:

  • Lightricks for open-sourcing the LTX-2 trainer and the LTX-2.3 22B model

  • The authors of Ditto-1M for releasing the style-transfer dataset that made this LoRA possible

Support

Training models like this requires renting cloud GPUs, which gets expensive quickly. If you find this LoRA useful and would like me to keep contributing open-source models, your support is very much appreciated:

Ko-fi · Liberapay