ggml: add Q1_0 1-bit quantization support (CPU) - 1-bit Bonsai models
r/LocalLLaMA
Bonsai's 8B model is just 1.15 GB, so CPU alone is more than enough. Submitted by /u/pmttyji
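For a rough sense of where a figure like 1.15 GB comes from, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not the actual Q1_0 layout in ggml: it supposes each block of weights stores 1 bit per weight plus one shared scale, and the `block_size` and `scale_bits` defaults, the nominal 8e9 parameter count, and the function name `quantized_size_gb` are all hypothetical. Real files also carry metadata and may keep some tensors at higher precision, so this only shows the order of magnitude.

```python
def quantized_size_gb(n_params, block_size=128, scale_bits=16):
    """Estimate the size in GB (1e9 bytes) of a 1-bit block-quantized model.

    ASSUMPTION: each block of `block_size` weights stores 1 bit per weight
    plus one shared scale of `scale_bits` bits. These defaults are
    illustrative guesses, not the confirmed Q1_0 block layout.
    """
    # Effective bits per weight: 1 sign bit plus the amortized scale.
    bits_per_weight = 1 + scale_bits / block_size
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Nominal parameter count for an "8B" model (hypothetical round number).
    print(f"{quantized_size_gb(8.0e9):.3f} GB")
```

Under these assumptions the estimate lands a little above 1 GB, the same ballpark as the quoted file size and small enough to fit comfortably in ordinary system RAM.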