Gemma 4 MTP vs DFlash on 1x H100: dense vs MoE results

r/LocalLLaMA

I benchmarked Gemma 4 MTP against z-lab's DFlash on a single H100 80GB, using vLLM and NVIDIA's SPEED-Bench qualitative dataset.