I built a compression format for AI models — 60-80% smaller, need help testing

r/StableDiffusion

Round 2 FIGHT!

Hey everyone - some of you might remember my VRAM pager project from a couple of days back. Ultimately I was a little late to that party, but sometimes stepping back leads us to other innovations. I created a new compression method for models, called DMX, and would greatly appreciate some help testing it.

Results so far:
- 9.1 GB model → 1.8 GB (80% smaller)
- 7.2 GB model → 1.5 GB (79.5% smaller)
- Llama 3 8B: only a 0.16% perplexity increase

Where I need your help:
- Try it on models I haven't, especially Mixtral, FLUX, Gemma
- Try to break it.
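For anyone reporting numbers back, here is a minimal Python sketch of how the two headline figures above (percent smaller, percent perplexity change) can be computed so results are comparable. The function names are hypothetical helpers for illustration; they are not part of DMX, and nothing here assumes how DMX itself is invoked.

```python
import os


def size_reduction_pct(original_size: float, compressed_size: float) -> float:
    """Percent by which the compressed model is smaller than the original.

    Both sizes must be in the same unit (bytes, GB, ...).
    """
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (1.0 - compressed_size / original_size)


def perplexity_change_pct(baseline_ppl: float, compressed_ppl: float) -> float:
    """Relative perplexity change in percent; positive means worse."""
    if baseline_ppl <= 0:
        raise ValueError("baseline_ppl must be positive")
    return 100.0 * (compressed_ppl / baseline_ppl - 1.0)


def file_size_reduction_pct(original_path: str, compressed_path: str) -> float:
    """Same as size_reduction_pct, but reads the sizes from files on disk."""
    return size_reduction_pct(
        os.path.getsize(original_path),
        os.path.getsize(compressed_path),
    )


if __name__ == "__main__":
    # First result from the post: 9.1 GB -> 1.8 GB
    print(f"{size_reduction_pct(9.1, 1.8):.1f}% smaller")  # prints "80.2% smaller"
```

Posting the original size, the compressed size, and baseline vs. compressed perplexity measured on the same eval set makes it easier to compare runs across different models.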