AI RESEARCH
From Pixels to BFS: High Maze Accuracy Does Not Imply Visual Planning
arXiv CS.LG
•
ArXi:2603.26839v1 Announce Type: new How do multimodal models solve visual spatial tasks -- through genuine planning, or through brute-force search in token space? We