Qwen3.5-35B GGUF quants (16–22 GiB) - KLD + speed comparison
r/LocalLLaMA
I'm back with some benchmarks. I measured the KLD of the Qwen3.5-35B-A3B GGUF quantizations (16-22 GiB) currently available on Hugging Face.

KLD: the Kullback-Leibler divergence, which measures how far the quantized model's output probability distribution drifts from the FP16 baseline's, computed from both models' logits over a reference corpus. Lower is better; 0 means the two distributions are identical.
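For anyone who wants to see what the metric boils down to, here is a minimal sketch of per-token KLD, D_KL(P || Q) = sum_i p_i * log(p_i / q_i), with P taken from the FP16 logits and Q from the quantized logits, then averaged over token positions. The function names and the toy logits are made up for illustration, and taking FP16 as the reference distribution is an assumption; this is not necessarily how the numbers in this post were produced.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(base_logits, quant_logits):
    """KL(P_base || P_quant) in nats for a single token position."""
    log_p = log_softmax(base_logits)   # reference (FP16) distribution
    log_q = log_softmax(quant_logits)  # quantized distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_kld(base_rows, quant_rows):
    """Average the per-position KLD over all token positions."""
    values = [kl_divergence(b, q) for b, q in zip(base_rows, quant_rows)]
    return sum(values) / len(values)

# Toy logits for two token positions over a 3-token vocabulary (made-up numbers).
base = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
quant = [[1.9, 1.1, 0.1], [0.4, 0.7, 2.8]]
print(f"mean KLD: {mean_kld(base, quant):.6f} nats")
```

Working in log space keeps this stable for large logits, and adding a constant to every logit of either model leaves the result unchanged, since softmax is shift-invariant.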