MAP: Mitigating Hallucinations in Large Vision-Language Models with Map-Level Attention Processing

arXiv:2508.01653v2 Announce Type: replace-cross Large Vision-Language Models (LVLMs) have achieved impressive performance in multimodal tasks, but they still suffer from hallucinations, i.e., generating content that is grammatically accurate but inconsistent with visual inputs. In this work, we