Perceptual Flow Network for Visually Grounded Reasoning

arXiv:2605.02730v1. Despite the success of Large Vision-Language Models (LVLMs), general optimization objectives (e.g., standard MLE) fail to constrain visual trajectories, leading to language bias and hallucination. To mitigate this, current methods