AI RESEARCH
Mind over Space: Can Multimodal Large Language Models Mentally Navigate?
arXiv cs.AI
arXiv:2603.21577v1. Despite the widespread adoption of MLLMs in embodied agents, their capabilities remain largely confined to reactive planning from immediate observations, consistently failing in spatial reasoning across extensive spatiotemporal scales. Cognitive science reveals that Biological Intelligence (BI) thrives on "mental navigation": the strategic construction of spatial representations from experience and the subsequent mental simulation of paths prior to action. To bridge the gap between AI and BI, we.