RTX 5070 Ti with Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf: token speed 564/41

r/LocalLLaMA
AI Hardware

--model "/mnt/e/my-path-change-to-yours/qwen3.6-35b/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf" \
--ctx-size 262144 \
--parallel 1 \
--n-cpu-moe 29 \
--no-mmap \
--mlock \
--cache-type-k q4_0 \
--cache-type-v q4_0

10.8/16 GB dedicated VRAM (I need room for Windows and game engines)
13.6/15.6 GB shared RAM
23.5/32 GB normal RAM (Windows, Chrome, and the WSL setup are also leeching it)

This channel is the real hero. They make it work on a 6 GB GPU, FFS.

Btw, as you can see, I couldn't use TurboQuants.
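
For anyone who wants to reuse the settings, here is a minimal sketch of a launch script that keeps the flags in a bash array so each one can be commented and changed on its own. Several things here are assumptions and not from the post: the binary is taken to be llama.cpp's `llama-server` (the post never names it), `--cache-type-v` is a reading of the second cache flag, and `LLAMA_BIN`, `MODEL`, and `DRY_RUN` are hypothetical variable names made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the flags from the post in a bash array.
# Assumptions: llama-server is the binary; LLAMA_BIN, MODEL and DRY_RUN
# are hypothetical names introduced for this example.
set -euo pipefail

LLAMA_BIN="${LLAMA_BIN:-llama-server}"
MODEL="${MODEL:-/mnt/e/my-path-change-to-yours/qwen3.6-35b/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf}"

args=(
  --model "$MODEL"
  --ctx-size 262144     # context window, in tokens
  --parallel 1          # a single server slot
  --n-cpu-moe 29        # keep the MoE expert weights of the first 29 layers on the CPU
  --no-mmap             # load the model into memory without memory-mapping the file
  --mlock               # ask the OS to keep the model in RAM and not swap it out
  --cache-type-k q4_0   # quantize the K cache
  --cache-type-v q4_0   # quantize the V cache (assumed flag, see note above)
)

if [[ "${DRY_RUN:-1}" == "1" ]]; then
  # Default: only print the command that would be run.
  printf '%q ' "$LLAMA_BIN" "${args[@]}"
  printf '\n'
else
  exec "$LLAMA_BIN" "${args[@]}"
fi
```

By default the script only prints the command. Setting `DRY_RUN=0` would actually start the server, and `MODEL=/path/to/other.gguf` swaps the model without editing the file. Lowering `--n-cpu-moe` moves more expert layers onto the GPU, so that is probably the first knob to try if there is VRAM to spare.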