Overthinking Causes Hallucination: Tracing Confounder Propagation in Vision Language Models

arXiv:2603.07619v1 Announce Type: new

Vision language models (VLMs) often hallucinate non-existent objects. Detecting hallucination is analogous to detecting deception: a single final statement is insufficient; one must examine the underlying reasoning process. Yet existing detectors rely mostly on final-layer signals. Attention-based methods assume that hallucinated tokens receive low attention, while entropy-based methods use the uncertainty of the final decoding step.
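To make the entropy-based baseline concrete, the following is a minimal sketch of a final-step uncertainty score, not the method proposed in this paper. It assumes the detector has access to the output logits at each generated token; the function names `token_entropy` and `flag_uncertain_tokens` and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over one step's logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def flag_uncertain_tokens(step_logits, threshold):
    """Mark each generated token whose final-step entropy exceeds the threshold.

    `threshold` is an assumed, hand-picked value; a real detector would tune it
    on labeled data.
    """
    return [token_entropy(logits) > threshold for logits in step_logits]


# Toy example: a flat distribution (uncertain) and a sharply peaked one (confident).
example_logits = [[0.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0]]
flags = flag_uncertain_tokens(example_logits, threshold=0.5)
```

A score of this kind looks only at the last layer's output distribution, which is the limitation the abstract points to: it says nothing about how the prediction formed in earlier layers.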