PSA: Qwen3.6 ships with preserve_thinking. Make sure you have it on.
r/LocalLLaMA
•
Open Source AI
I had previously posted here about a fix to their 3.5 template to help resolve the KV cache invalidation issue from their template. A lot of you found it useful. Qwen 3.6 now addresses this with a new preserve_thinking flag. From their model page: please use "preserve_thinking": True instead of "chat_template_kwargs": {"preserve_thinking": False}. This capability is particularly beneficial for agent scenarios, where maintaining full reasoning context can enhance decision consistency and, in many cases, reduce overall token consumption by minimizing redundant reasoning.