AI RESEARCH

SpatialMem: Metric-Aligned Long-Horizon Video Memory for Language Grounding and QA

arXiv CS.AI

ArXi:2601.14895v2 Announce Type: replace-cross We present SpatialMem, a memory-centric system for long-horizon, language-grounded retrieval and QA from egocentric video, where metric 3D serves as an interpretable indexing scaffold rather than an explicit mapping objective.