MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems
arXiv cs.CV
arXiv:2503.16549v2 Announce Type: replace Despite strong results on many tasks, multimodal large language models (MLLMs) still underperform on visual mathematical problem solving, especially in reliably perceiving and interpreting diagrams. Inspired by human problem-solving, we hypothesize that the ability to extract meaningful information from diagrams is pivotal, as it directly conditions subsequent inference. Hence, we