ggml: add Q1_0 1-bit quantization support (CPU) - 1-bit Bonsai models

r/LocalLLaMA
Generative AI

Bonsai's 8B model is just 1.15GB, so CPU alone is more than enough. submitted by /u/pmttyji
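
For anyone wondering how an 8B model lands near 1GB, here is a rough back-of-the-envelope sketch. The block layout is an assumption for illustration (one sign bit per weight plus one shared 16-bit scale per block of 128 weights), not something stated in the post, and the helper names `bits_per_weight` and `model_size_gb` are made up here. Real files also carry metadata and may keep some tensors at higher precision, so treat the result as a ballpark only.

```python
def bits_per_weight(block_size: int, scale_bits: int) -> float:
    """Effective bits per weight for a block-quantized 1-bit format.

    Assumed layout (hypothetical): each weight stores 1 bit, and every
    block of `block_size` weights shares one scale of `scale_bits` bits.
    """
    return 1.0 + scale_bits / block_size


def model_size_gb(n_params: float, bpw: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bpw / 8 / 1e9


# Assumed values: 128 weights per block, fp16 scale, 8e9 parameters.
bpw = bits_per_weight(block_size=128, scale_bits=16)
size = model_size_gb(8e9, bpw)
print(f"{bpw:.3f} bits/weight, about {size:.2f} GB for 8e9 parameters")
```

Under those assumptions the estimate comes out a little over 1GB, which is in the same range as the 1.15GB figure above.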