I trained an anime image model in 2 days from scratch on 1 local GPU
r/StableDiffusion
Using a combination of recent papers, I trained a 250M text-to-image anime model in 2 days from scratch (not a finetune of an existing diffusion model) on 1 local RTX Pro 6000 GPU.

VAE: Trained in 8 hours using DINOv3 as the encoder
Diffusion Model: Trained in 42 hours