I have DeepSeek V4 Pro at home
r/LocalLLaMA
Just wanted to share that I used u/LegacyRemaster's slightly modified DeepSeek V4 CUDA repo (it adds Q4_K_M conversion and is based on u/antirez's work) to convert and run a Q4_K_M quant of DeepSeek V4 Pro on my Epyc workstation (Genoa 9374F, 12 x 96GB RAM, single RTX PRO 6000 Max-Q), and it worked right from the start:

(base) phm:~/projects/llama.cpp-deepseek-v4-flash-cuda/build-cuda$ ./bin/llama-cli -m ./models/DeepSeek-V4-Pro-Q4_K_M.gguf --no-repack -ub 128 --chat-template-file ./models/templates/deepseek-ai-DeepSeek-V3.2.jinja

ggml_cuda_init: found 1 CUDA devices (Total VRAM: 97247 MiB): Device 0.