80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

When You're Reading CoT, the Model Is Thinking Something Else

Thinking models are everywhere now. DeepSeek-R1, Claude 3.7 Sonnet, Qwen3.5: models that show you their reasoning process keep multiplying. When I run Qwen3.5-9B on an RTX 4060, the thinking block spills out line after line of internal reasoning. "Wait, let me reconsider." "Actually, this approach is better." It debates itself all the way to an answer. It feels reassuring. You think: okay, it's actually thinking this through. That reassurance has no foundation.