What I got with a 5060 Ti 16GB + Qwen3.6-35B-A3B-UD-Q5_K_M
r/LocalLLaMA
I started trying local models a couple of weeks ago. At first I used Ollama, but Reddit said it was better to switch to llama.cpp, so I switched to the prebuilt llama.cpp release. It was amazing: speed almost doubled running Qwen3.5 9 Q8_K_M and Qwen3.5 35B-A3B Q4_K_M, and I was very happy with it.

This week, ChatGPT and Gemini suggested that I build llama.cpp on my own PC to get maximum optimization. I did, and the result made me happy again: almost a 10% improvement.

HW:
CPU: AMD 9700X
GPU: 5060 Ti 16GB
RAM: 16GB x2

Here's the result:

It's confusing to see it listed as qwen 35moe 35B.
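For anyone comparing builds the same way, here is a minimal sketch of how a percentage like that could be worked out from two tokens-per-second readings. The function name `percent_change` and the two numbers are made up for illustration; they are not measurements from this post.

```python
def percent_change(baseline_tps: float, new_tps: float) -> float:
    """Relative change in throughput, in percent, going from baseline to new."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (new_tps - baseline_tps) / baseline_tps * 100.0


# Hypothetical tokens/s values, for illustration only.
prebuilt_tps = 40.0      # assumed reading from a prebuilt binary
local_build_tps = 44.0   # assumed reading from a locally compiled binary

print(f"{percent_change(prebuilt_tps, local_build_tps):.1f}% change")
```

Plugging in your own before and after readings, taken with the same model, quant, and settings, gives a like-for-like comparison.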