Mitigating Action-Relation Hallucinations in LVLMs via Relation-aware Visual Enhancement

arXiv:2605.11808v1 Announce Type: new Large Vision-Language Models (LVLMs) have achieved remarkable performance on diverse vision-language tasks. However, LVLMs still suffer from hallucinations: they generate text that contradicts the visual input. Existing research has focused primarily on mitigating object hallucinations and has often overlooked the more complex relation hallucinations, particularly those in action relations, which involve interactions between objects.