When Language Overwrites Vision: Over-Alignment and Geometric Debiasing in Vision-Language Models

arXiv:2605.08245v1 Announce Type: cross

Vision-Language Models (VLMs) increasingly power high-stakes applications, from medical imaging to autonomous systems, yet they routinely hallucinate, confidently describing content that is not present in the input. We investigate the root causes of these failures through a mechanistic analysis of decoder-based VLMs.