Help running Qwen3-Coder-Next TurboQuant (TQ3) model
r/LocalLLaMA
I found a TQ3-quantized version of Qwen3-Coder-Next. According to the page, this model requires a compatible inference engine that supports TurboQuant. It also provides a llama-server command, but it doesn’t clearly specify which version or fork of llama.cpp should be used (or maybe I missed it).

I’ve tried llama.cpp forks that claim to support TQ3, but none of them worked for me.

If anyone has successfully run this model, I’d really appreciate it if you could share how you did it.

Submitted by /u/UnluckyTeam3478