Let’s talk quants of Gemma and Qwen: 16-bit vs Q8 vs Q4. Any experiences?

r/LocalLLaMA

Some people say they’d never go under Q8, and others say they find Q3 acceptable! What’s your take?

Submitted by /u/Borkato