How to fix prompt reprocessing in qwen3.5 models (instruct mode only)
r/LocalLLaMA
•
Generative AI
Open Source AI
Quick disclaimer: this only applies to instruct mode (thinking disabled). If you're using thinking, the template will still behave like the default. I was running Qwen 3.5 in llama.cpp with thinking disabled and noticed it was reprocessing the last message on every turn instead of picking up from where it left off. The culprit is in the default Jinja chat template. When you disable thinking, the template injects an empty think block before generation: \n\n \n\n.