Quantisation effects of Qwen3.6 35b a3b

r/LocalLLaMA

I'm curious how people are finding the effects of quantisation on the 35b. I recently upgraded to 48GB of VRAM, so I've jumped from ud-q4_k_xl to q8, and the difference feels stark. Tool calling is simply more effective, it seems to pick up on the vagueness and nuance of some prompts, and it gives well-rounded answers to some research-style questions. It was a quick vibe test, admittedly, but I'm going to try ud-q6_k_xl soon to see whether the 5+GB of VRAM is worth the quality difference, and I'm curious to hear others' findings.