AI RESEARCH
GeoSense: Internalizing Geometric Necessity Perception for Multimodal Reasoning
arXiv cs.CV
arXiv:2603.10370v1 Announce Type: new Advancing towards artificial superintelligence requires rich and intelligent perceptual capabilities. A critical frontier in this pursuit is overcoming the limited spatial understanding of Multimodal Large Language Models (MLLMs), for which geometric information is essential. Existing methods often address this by rigidly injecting geometric signals into every input, ignoring whether those signals are actually needed and adding computational overhead.