I have (even faster) DeepSeek V4 Pro at home

r/LocalLLaMA

A few days ago I posted about my DeepSeek V4 Pro at home, and now it's time for an update. Yesterday I finally managed to run this model in ktransformers (sglang + kt-kernel). I followed the tutorial for DeepSeek V4 Flash and tweaked the NUMA and core-count options for my hardware (Epyc 9374F + RTX PRO 6000 Max-Q). Then I ran llama-benchy at increasing context depths to check the performance.