Gemma4 26b MoE running in MLX with turboquant (and custom kernel)

r/LocalLLaMA

TL;DR: I spent a few crazy evenings this past week seeing if I could get Gemma4 running with proper turboquant and a rotating KV cache.