nvidia/Gemma-4-26B-A4B-NVFP4

Submitted by /u/reto-wyss

Can confirm it works on a 5090. With 80% allocation (of 32 GB) I got around 50k context. The model is 18.8 GB.

| Benchmark | Baseline (Full Precision) | NVFP4 |
|---|---|---|
| GPQA Diamond | 80.30% | 79.90% |
| AIME 2025 | 88.95% | 90.00% |
| MMLU Pro | 85.00% | 84.80% |
| LiveCodeBench (pass) | 80.50% | 79.80% |
| IFBench | 77.77% | 78.1% |
| IFEval | 96.60% | 96.40% |
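For a rough sense of where the memory goes, here is a back-of-the-envelope sketch using only the figures quoted in the post. It assumes decimal gigabytes and treats everything left after the weights as available for context, which overstates it, since activations and runtime overhead share that space. The function name `memory_budget` is made up for illustration.

```python
# Figures quoted in the post; everything derived from them is a rough estimate.
TOTAL_VRAM_GB = 32.0      # RTX 5090
ALLOCATION = 0.80         # fraction of VRAM allocated
WEIGHTS_GB = 18.8         # reported model size
CONTEXT_TOKENS = 50_000   # "around 50k" context


def memory_budget(total_gb, allocation, weights_gb):
    """Return (allocated_gb, headroom_gb) for a given allocation fraction."""
    allocated = total_gb * allocation
    return allocated, allocated - weights_gb


allocated_gb, headroom_gb = memory_budget(TOTAL_VRAM_GB, ALLOCATION, WEIGHTS_GB)

# Upper bound only: assumes ALL headroom is spent on per-token context state.
kb_per_token = headroom_gb * 1e6 / CONTEXT_TOKENS  # GB -> KB, decimal units

print(f"allocated: {allocated_gb:.1f} GB")
print(f"headroom after weights: {headroom_gb:.1f} GB")
print(f"upper bound per token of context: {kb_per_token:.0f} KB")
```

So roughly 25.6 GB is allocated and about 6.8 GB remains after the weights, which is the space the 50k context has to fit into.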