V100 32 GB: 6h of benchmarks across 20 models with CPU offloading & power limiting
r/LocalLLaMA
I posted a few days ago about my setup here:

- Ryzen 7600X & 32 GB DDR5
- Nvidia V100 32 GB PCIe (air cooled)

I ran 6 hours of benchmarks across 20 models (MoE & dense), from Nemotron…Qwen to DeepSeek 70B, with different configurations of:

- Power limit (300W, 250W, 200W, 150W)
- CPU offload (100% GPU, 75% GPU, 50% GPU, 25% GPU, 0% GPU)
- Context window (up to 32K)

TLDR:

- Power limiting is free for generation. Running at 200W saves 100W with <2% loss on tg128. MoE/hybrid models are bandwidth-bound. Only dense prompt processing shows degradation at 150W (−22.