See, Remember, Explore: A Benchmark and Baselines for Streaming Spatial Reasoning

arXiv:2603.23864v1 Announce Type: new Spatial understanding is fundamental for embodied agents, yet most spatial VLMs and benchmarks remain offline, evaluating post-hoc QA over pre-recorded inputs and overlooking two deployment-critical requirements: long-horizon streaming inference and active perception when the current view is insufficient. To address this gap, we