AI RESEARCH
Constructive Distortion: Improving MLLMs with Attention-Guided Image Warping
arXiv CS.LG
arXiv:2510.09741v3 Announce Type: replace-cross Multimodal large language models (MLLMs) often miss small details and spatial relations in cluttered scenes, leading to errors in fine-grained perceptual grounding. We