AI RESEARCH

Learning to Focus and Precise Cropping: A Reinforcement Learning Framework with Information Gaps and Grounding Loss for MLLMs

arXiv CS.AI

arXiv:2603.27494v1 Announce Type: cross To enhance the perception and reasoning capabilities of multimodal large language models in complex visual scenes, recent research has