NVIDIA admits to only 2x performance boost at max throughput with new generation of Rubin GPUs
r/LocalLLaMA
AI Hardware
NVIDIA admits to only a 2x performance boost from Rubin at max throughput, which is what 99% of companies are running in production anyway. No sandbagging this time by comparing chips with 80GB of VRAM to chips with 288GB of VRAM. They're forced to compare apples to apples. Despite Rubin having almost 3x the memory bandwidth and apparently 5x the FP4 perf, that results in only 2x the output throughput. And that's at 1000W TDP for the B200 vs 2300W for the R200, so you're using 2.3x the power per GPU to get 2x the performance. Not really efficient, is it?

submitted by /u/bigboyparpa
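For anyone who wants the efficiency math spelled out, here is a rough sketch using only the numbers quoted in the post. It assumes TDP is a fair stand-in for actual power draw under load, which may not hold in practice, and the function and constant names are made up for illustration.

```python
# Figures as quoted in the post; treat them as the poster's claims, not verified specs.
B200_TDP_W = 1000       # claimed TDP of B200, in watts
R200_TDP_W = 2300       # claimed TDP of R200, in watts
THROUGHPUT_RATIO = 2.0  # claimed Rubin vs Blackwell output throughput at max throughput


def perf_per_watt_ratio(throughput_ratio, old_tdp_w, new_tdp_w):
    """Throughput per watt of the new chip relative to the old one.

    Assumes TDP approximates real power draw (hypothetical simplification).
    """
    power_ratio = new_tdp_w / old_tdp_w
    return throughput_ratio / power_ratio


ratio = perf_per_watt_ratio(THROUGHPUT_RATIO, B200_TDP_W, R200_TDP_W)
print(f"power ratio: {R200_TDP_W / B200_TDP_W:.2f}x")
print(f"throughput per watt vs B200: {ratio:.2f}x")
```

Under those assumptions the ratio comes out below 1, i.e. slightly less output per watt per GPU than the previous generation. It says nothing about per-rack density, cooling, or cost per token.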