Quantisation effects of Qwen3.6 35b a3b
r/LocalLLaMA
I'm curious how people are finding the quantisation effects on the 35b. I recently upgraded to 48GB of VRAM, so I've jumped from ud-q4_k_xl to q8, and the difference feels stark: tool calling is just more effective, it seems to pick up on the vagueness and nuance of some prompts, and it gives well-rounded answers to some research-like questions. It was a quick vibe test, admittedly, but I'm going to try ud-q6_k_xl soon to see whether the extra 5+GB of VRAM that q8 takes is worth the quality gain. I'm curious to see others' findings.
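For a rough sense of where a gap like that 5+GB comes from, here's a back-of-envelope sketch that estimates weight size from bits per weight. The bits-per-weight figures are for the plain llama.cpp block formats (q4_k, q6_k, q8_0); the ud-..._xl mixes keep some tensors at higher precision, so real file sizes will differ. The 35e9 parameter count is an assumption taken from the model name, and KV cache and context overhead are not included.

```python
# Approximate bits per weight for plain llama.cpp block formats.
# "ud-..._xl" mixes store some tensors at higher precision, so real files differ.
BITS_PER_WEIGHT = {
    "q4_k": 4.5,     # 144 bytes per 256-weight block
    "q6_k": 6.5625,  # 210 bytes per 256-weight block
    "q8_0": 8.5,     # 34 bytes per 32-weight block
}


def weight_size_gb(n_params: float, quant: str) -> float:
    """Estimated size of the weights alone, in GB (1e9 bytes)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


N_PARAMS = 35e9  # assumption: nominal parameter count from the model name

for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{weight_size_gb(N_PARAMS, quant):.1f} GB")
```

Treat the output as a ballpark only: it says nothing about quality, just how much VRAM each step up in precision costs before context is added.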