Anyone want to try my llama.cpp DeepSeek V3.2 PR?

r/LocalLLaMA

Code: git clone -b deepseek-dsa --single-branch

ed GGUFs (Q4_K_M ~ 404GB, Q8_0 ~ 714GB):

Chat template to use: models/templates/deepseek-ai-DeepSeek-V3.2.jinja

If you experience OOM errors in CUDA ggml_top_k, try lowering the ubatch size and/or increasing the `-fitt` value.

Let me know if you encounter any problems.

submitted by /u/fairydreaming
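For anyone unsure how the pieces above fit together, here is a rough sketch of a launch command. It is not from the PR itself. The model path and binary location are placeholders, the values 256 and 2048 are arbitrary starting points, and treating the `-fitt` value as a memory margin in MiB is an assumption. `-m`, `--jinja`, `--chat-template-file` and `-ub` are the usual llama.cpp options.

```shell
#!/bin/sh
# Placeholder paths: point these at your own GGUF and llama.cpp build.
MODEL="${MODEL:-/path/to/model.gguf}"
SERVER="${SERVER:-./build/bin/llama-server}"

# Chat template named in the post (relative to the llama.cpp checkout).
TEMPLATE="models/templates/deepseek-ai-DeepSeek-V3.2.jinja"

# Tuning knobs for the ggml_top_k OOM case. Both values are guesses:
# try a smaller ubatch first, then a larger -fitt value if OOM persists.
UBATCH="${UBATCH:-256}"
FIT_TARGET="${FIT_TARGET:-2048}"   # assumed to be a margin in MiB

CMD="$SERVER -m $MODEL --jinja --chat-template-file $TEMPLATE -ub $UBATCH -fitt $FIT_TARGET"

# Print the command for review; uncomment the last line to launch it.
echo "$CMD"
# exec $CMD
```

If the OOM persists, halve `UBATCH` again (e.g. `UBATCH=128 sh run.sh`, where `run.sh` is whatever you saved the script as) before raising `FIT_TARGET`.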