The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

arXiv:2603.29025v1 Announce Type: cross Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a diagnose-measure-bridge-treat framework. Causal-behavioral analysis of the ``car wash problem'' across six models reveals approximately context-independent sigmoid heuristics: the distance cue exerts 8.7 to 38 times more influence than the goal, and token-level attribution shows patterns more consistent with keyword associations than with compositional inference.
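
To make the phrase "sigmoid heuristic" concrete, the following is a minimal sketch and not the paper's fitted model. It assumes a walk-versus-drive decision driven by two binary cues, and it assumes a plain logistic form. The function name `p_walk`, the cue encoding, the zero bias, and the goal weight of -0.2 are all hypothetical; only the 8.7 ratio (the low end of the reported range) comes from the abstract.

```python
import math

# Hypothetical weights for illustration only. The abstract reports that the
# distance cue has 8.7 to 38 times more influence than the goal; the low end
# of that range is used here. The absolute scale (0.2) is an assumption.
W_GOAL = -0.2                       # goal requiring the car pushes against "walk"
W_DISTANCE = 8.7 * abs(W_GOAL)      # short distance pushes toward "walk"


def p_walk(short_distance: bool, goal_requires_car: bool) -> float:
    """Probability of answering "walk" under an assumed logistic heuristic."""
    logit = W_DISTANCE * short_distance + W_GOAL * goal_requires_car
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Conflict case: the destination is close, but the goal needs the car.
    print(f"P(walk | short distance, car required) = {p_walk(True, True):.2f}")
    print(f"P(walk | long distance, car required)  = {p_walk(False, True):.2f}")
```

Under these assumed weights the distance cue dominates: the goal constraint lowers the probability of "walk" only slightly, so the heuristic still recommends walking in the conflict case.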