This LoRA is a personal proof-of-concept to verify whether Chroma LoRA training is feasible on a local system with a 16GB VRAM GPU.
The dataset consists of in-game screenshots, and captions were automatically generated using the Qwen3-VL-8B model following a consistent set of rules.
Note: The following instructions assume the use of ComfyUI. I have no experience with other tools and cannot provide guidance for them.
A "lora key not loaded" error may appear in ComfyUI, but it does not affect image generation.
At the default weight of 1.0, images are generated in a 3D graphics style resembling the original game. Lowering the weight causes the character likeness to fade but produces a more realistic result. The showcase images were generated at a weight of 0.7 with a resolution of 960×1536.
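Conceptually, the LoRA weight slider scales the learned update before it is added to the base model's weights, which is why lowering it from 1.0 to 0.7 softens the game-style effect. A minimal pure-Python sketch of that blending (illustrative only — not ComfyUI's actual implementation):

```python
# Illustrative sketch: how a LoRA "weight" (strength) blends the learned
# delta into the base model weights. Not ComfyUI's actual code.

def apply_lora(base, delta, strength):
    """Return base + strength * delta, element-wise."""
    return [b + strength * d for b, d in zip(base, delta)]

base = [1.0, 2.0, 3.0]              # a row of base-model weights (toy values)
delta = [0.5, -0.5, 1.0]            # the LoRA's learned update (toy values)

full = apply_lora(base, delta, 1.0)  # weight 1.0: full game-style effect
soft = apply_lora(base, delta, 0.7)  # weight 0.7: what the showcase images used
```

At strength 0.0 the LoRA contributes nothing and the base model is unchanged; intermediate values interpolate between the two.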
Recommended Settings
Diffusion Model: GonzaLomo v3.0 flash
Sampler: ClownsharKSampler (custom node) — exponential / res_2s, or KSampler — dpmpp_2m / res_2m / res_2s
Scheduler: bong_tangent, beta, or beta57
CFG: 1
Steps: 10
If you switch to a different base model, you will need to find settings appropriate for that model.
Training Details
The training was done using OneTrainer. Below are the system specs and training configuration. I believe Chroma has the potential to be a viable alternative to Pony. I hope more people will train and share Chroma LoRAs — that's why I'm sharing these details, and I hope they serve as a useful reference.
System
GPU: RTX 4060 Ti 16GB VRAM
RAM: 32GB
Dataset & Training
Images: 30
Epochs: 50
Total Steps: 1,500 (~6 hours)
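The step count follows directly from the numbers above (30 images × 50 epochs at batch size 1). Assuming the "LR Warmup Steps: 0.05" value below is a fraction of total steps — my reading of the setting, not something I can confirm — it works out to 75 warmup steps:

```python
# Sanity-check the training budget from the figures above.
images = 30
epochs = 50
batch_size = 1

total_steps = images * epochs // batch_size   # 30 * 50 = 1,500 steps
sec_per_step = 6 * 3600 / total_steps         # ~6 hours total -> ~14.4 s/step

# If "LR Warmup Steps: 0.05" is a fraction of total steps (assumption):
warmup_steps = int(0.05 * total_steps)        # 75 steps
```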
Hyperparameters
Optimizer: ADAMW_8BIT
LR Scheduler: COSINE
Learning Rate: 0.0003
LR Warmup Steps: 0.05
Local Batch Size: 1
Train Text Encoder: Enabled
Text Encoder Learning Rate: 1e-05
Layer Offload Fraction: 0.5
Resolution: 1024
LoRA Rank: 32
LoRA Alpha: 64.0
Tips
20–40 high-quality images are sufficient for training a character LoRA. Adding more images does not necessarily improve quality. In terms of total steps, convergence typically occurs within 2,000 steps — training beyond that tends to result in a loss of image sharpness. In my experience, anything above 2,500 steps is unnecessary.
A deliberately high learning rate was used in this LoRA to lock down the character. If you find the character is too rigid and lacks flexibility, try lowering the LR to 0.0002 or below 0.0001, or reduce the alpha to 32 or 16.
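One way to see why lowering alpha relaxes the character is through the standard LoRA convention, where the learned update is scaled by alpha/rank before being added to the base weights: ΔW = (alpha/rank) · B · A. A rough sketch under that convention (OneTrainer's internals may differ in detail):

```python
# Standard LoRA convention: the update applied to a base weight matrix is
#   delta_W = (alpha / rank) * B @ A
# so alpha/rank acts as a fixed gain on everything the LoRA learned.

def lora_scale(alpha, rank):
    return alpha / rank

rank = 32
for alpha in (64.0, 32.0, 16.0):   # this LoRA, and the two suggested reductions
    print(f"alpha={alpha:>4} -> scale={lora_scale(alpha, rank)}")
# alpha 64 -> scale 2.0 (this LoRA: strong, character locked down)
# alpha 32 -> scale 1.0 (neutral)
# alpha 16 -> scale 0.5 (weaker, more flexible character)
```

Halving alpha halves the gain on the entire learned update, which is a gentler knob than retraining at a lower learning rate.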

