AI RESEARCH
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training
arXiv cs.LG
arXiv:2603.12255v1 Announce Type: cross Humans perceive and understand real-world spaces through a stream of visual observations. The ability to continuously maintain and update spatial evidence from potentially unbounded video streams is therefore essential for spatial intelligence. The core challenge is not simply longer context windows but how spatial information is selected, organized, and retained over time. In this paper, we propose Spatial-TTT, an approach to streaming visual-based spatial intelligence with test-time training.