Benchmark: ik_llama.cpp vs llama.cpp on Qwen3/3.5 MoE Models
r/LocalLLaMA
Hey folks, I ran a series of benchmarks comparing ik_llama.cpp against the official llama.cpp across multiple Qwen3 and Qwen3.5 variants (including MoE architectures). The results showed some interesting performance flips depending on the model architecture and backend provider.

Hardware:
CPU: Ryzen 9 5950X
RAM: 64GB DDR4
GPU: RTX 5070 Ti

1.