Gemma 4 26B Hits 600 Tok/s on One RTX 5090

r/LocalLLaMA • May 08, 2026

I ran a benchmark to see how much DFlash speculative decoding actually helps in vLLM.