Qwen3.5-35B GGUF quants (16–22 GiB) - KLD + speed comparison

r/LocalLLaMA

I'm back with some benchmarks. I measured the KL divergence (KLD) of the actual Qwen3.5-35B-A3B GGUF quantizations (16-22 GiB) available on Hugging Face.

KLD: the Kullback-Leibler divergence, which shows how closely a quantized model's output matches the FP16 baseline. It compares the two models' probability distributions over the next token (computed from their logits) on a reference corpus. Lower is better, and 0 means the distributions are identical.
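For anyone who wants to see what that number is made of, here is a minimal sketch of a mean per-token KLD. It is not the tool used for these benchmarks. It assumes the direction KL(FP16 || quant), natural-log units (nats), and a plain average over token positions. The function names and the toy logits are made up for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one token position's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def token_kld(base_logits, quant_logits):
    """KL(P_base || P_quant) in nats for a single token position (assumed direction)."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_kld(base_rows, quant_rows):
    """Average the per-token KLD over every position in the corpus."""
    klds = [token_kld(b, q) for b, q in zip(base_rows, quant_rows)]
    return sum(klds) / len(klds)

# Toy 4-token vocabulary, two positions (hypothetical logits, not measured data).
base = [[2.0, 1.0, 0.1, -1.0], [0.5, 0.5, 3.0, 0.0]]
quant = [[1.9, 1.1, 0.0, -0.8], [0.4, 0.7, 2.8, 0.1]]

print(mean_kld(base, base))   # identical distributions give zero divergence
print(mean_kld(base, quant))  # a slightly perturbed model gives a small positive value
```

A real run does the same thing over a full vocabulary and thousands of token positions, so the logits are usually saved from the FP16 model once and reused for every quant.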