Online Self-Calibration Against Hallucination in Vision-Language Models

arXiv:2605.00323v1 Announce Type: cross

Large Vision-Language Models (LVLMs) often suffer from hallucinations, generating descriptions that include visual details absent from the input image. Recent preference alignment methods typically rely on supervision distilled from stronger models such as GPT. However, this offline paradigm