Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence
arXiv CS.CV
arXiv:2603.07660v1 Announce Type: new The pursuit of spatial intelligence fundamentally relies on access to large-scale, fine-grained 3D data. However, existing approaches predominantly construct spatial understanding benchmarks by generating question-answer (QA) pairs from a limited number of manually annotated datasets, rather than systematically annotating new large-scale 3D scenes from raw web data. As a result, their scalability is severely constrained, and model performance is further hindered by domain gaps inherent in these narrowly curated datasets.