Speed Benchmark: GLM 4.7 Flash vs Qwen 3.5 27B vs Qwen 3.5 35B A3B (Q4 Quants)

r/LocalLLaMA

I tested how fast these three thinking models run on my setup. I didn't check output quality at all, just raw generation speed. I used LM Studio with the max context set to 64k and GPU offload set to max for each model.
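
For anyone who wants to reproduce this kind of raw-speed measurement, here is a minimal sketch of how tokens per second could be computed from a timed generation. It is not how the numbers in this post were collected; the post doesn't say, and LM Studio may well report its own speed figures. The helper names `tokens_per_second` and `time_generation`, and the `generate` callable, are hypothetical: `generate` stands in for whatever runs one completion against your model and returns the number of tokens it produced.

```python
import time
from typing import Callable, Tuple


def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Raw generation speed: tokens produced divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def time_generation(generate: Callable[[], int]) -> Tuple[int, float, float]:
    """Run one generation and return (token count, seconds, tokens/sec).

    `generate` is a hypothetical stand-in: it should run a single
    completion and return how many tokens were generated.
    """
    start = time.perf_counter()
    n_tokens = generate()
    elapsed = time.perf_counter() - start
    return n_tokens, elapsed, tokens_per_second(n_tokens, elapsed)
```

Because this times the whole call, prompt processing and thinking tokens are included in the elapsed time, so compare models using the same prompt and the same settings (context length, GPU offload) for each run.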