Kohya improvements to FLUX LoRA (4 GB GPUs) and DreamBooth / Fine-Tuning (6 GB GPUs) training

Nov 17, 2024

Kohya has brought massive improvements to FLUX training: LoRA now trains on GPUs with as little as 4 GB of VRAM, and DreamBooth / full Fine-Tuning on as little as 6 GB.

  • GPUs with as little as 4 GB of VRAM can now train a FLUX LoRA with decent quality, and GPUs with 24 GB or less get a huge speed boost for full DreamBooth / Fine-Tuning training

  • The minimum is a 4 GB GPU for FLUX LoRA training and a 6 GB GPU for FLUX DreamBooth / full Fine-Tuning training. It is just mind-blowing.

  • You can download all configs and full instructions here: https://www.patreon.com/posts/112099700

  • The above post also has 1-click installers and downloaders for Windows, RunPod and Massed Compute

  • The model downloader scripts have also been updated; downloading 30+ GB of models takes about 1 minute in total on Massed Compute (see the download sketch after this list)

  • You can read the recent updates here: https://github.com/kohya-ss/sd-scripts/tree/sd3?tab=readme-ov-file#recent-updates

  • This is the Kohya GUI branch: https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1

  • The key to reducing VRAM usage is block swapping (see the block-swapping sketch after this list)

  • Kohya adopted OneTrainer's block-swapping logic, which significantly improves swap speed, and block swapping now works for LoRA training as well

  • You can now do FP16 LoRA training on GPUs with 24 GB or less

  • You can now train a FLUX LoRA on a 4 GB GPU; the keys are FP8 precision, block swapping, and training only certain layers (remember single-layer LoRA training; see the FP8/LoRA sketch after this list)

  • It took me more than a day to test all the newer configs, their VRAM demands, and their relative step speeds, and to prepare the configs :)
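
To make block swapping concrete: only a few transformer blocks stay resident on the GPU at any moment, while the rest wait in CPU RAM and are paged in just before the forward pass reaches them. Below is a minimal PyTorch sketch of the general technique, assuming toy blocks and a simple oldest-first eviction policy; it is not Kohya's or OneTrainer's actual implementation.

    # Minimal block-swapping sketch (illustrative, not Kohya's code).
    # Keep at most `resident` blocks on the GPU; page each block in just
    # before it runs and evict the oldest one back to CPU RAM.
    import torch
    import torch.nn as nn

    class BlockSwapRunner:
        def __init__(self, blocks: nn.ModuleList, resident: int = 2):
            self.blocks = blocks.to("cpu")  # all block weights start in CPU RAM
            self.resident = resident

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            on_gpu = []
            for block in self.blocks:
                block.to("cuda")              # page this block in
                on_gpu.append(block)
                if len(on_gpu) > self.resident:
                    on_gpu.pop(0).to("cpu")   # page the oldest block out
                x = block(x)
            for block in on_gpu:              # leave everything back on CPU
                block.to("cpu")
            return x

    # Usage: 8 toy blocks, at most 2 resident on the GPU at once.
    blocks = nn.ModuleList(
        nn.TransformerEncoderLayer(1024, 8, batch_first=True) for _ in range(8)
    )
    out = BlockSwapRunner(blocks).forward(torch.randn(1, 16, 1024, device="cuda"))

A real implementation keeps pinned CPU buffers and overlaps host-to-device copies with compute, which is the part Kohya sped up by adopting OneTrainer's logic; the plain .to() calls above trade that speed away for clarity.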
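
The other levers behind the 4 GB figure are FP8 storage for the frozen base weights and attaching trainable LoRA factors to only a subset of layers. Here is a hedged sketch of both ideas together: the FP8LinearWithLoRA wrapper is a hypothetical name, PyTorch's float8_e4m3fn dtype (available since 2.1) is used purely as a storage format, and the real training scripts expose these choices through their own options (see the README linked above).

    # Sketch: store frozen base weights in FP8 (half the memory of FP16)
    # and add trainable LoRA factors only to the layers chosen for training.
    # Illustrative only; not Kohya's implementation.
    import torch
    import torch.nn as nn

    class FP8LinearWithLoRA(nn.Module):  # hypothetical wrapper, not Kohya's API
        def __init__(self, weight: torch.Tensor, rank: int = 16, train_lora: bool = True):
            super().__init__()
            out_f, in_f = weight.shape
            # Frozen base weight kept as float8: storage only, the matmul
            # below runs in bf16 after an upcast.
            self.register_buffer("w8", weight.to(torch.float8_e4m3fn))
            self.train_lora = train_lora
            if train_lora:  # LoRA factors only on layers chosen for training
                self.lora_a = nn.Parameter(torch.randn(rank, in_f, dtype=torch.bfloat16) * 0.01)
                self.lora_b = nn.Parameter(torch.zeros(out_f, rank, dtype=torch.bfloat16))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            y = x @ self.w8.to(torch.bfloat16).t()  # upcast, then matmul
            if self.train_lora:
                y = y + (x @ self.lora_a.t()) @ self.lora_b.t()
            return y

    # "Single layer" idea: wrap the one projection you train, freeze the rest.
    layer = FP8LinearWithLoRA(torch.randn(3072, 3072), rank=16)
    out = layer(torch.randn(2, 3072, dtype=torch.bfloat16))

Gradients flow only into the small LoRA factors; the FP8 base weight is a buffer, so it never needs optimizer state, which is where most of the VRAM saving comes from.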
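
On the download speed, roughly a minute for 30+ GB is what datacenter bandwidth plus parallel downloads gives you. A minimal sketch of that pattern with huggingface_hub's snapshot_download; the repo id and worker count are placeholders, not the actual downloader script:

    # Sketch: parallel model download via huggingface_hub.
    # Repo id and worker count are placeholders, not the actual script.
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(
        repo_id="black-forest-labs/FLUX.1-dev",  # placeholder; gated repo, needs a token
        max_workers=8,                           # fetch files in parallel
    )
    print("models saved to", local_dir)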
