Let’s talk quants of Gemma and Qwen - 16-bit vs Q8 vs Q4 - any experiences?
r/LocalLLaMA
Some people say they’d never go below Q8, while others find Q3 acceptable. What’s your take?

Submitted by /u/Borkato