AI RESEARCH
Multiple Consistent 2D-3D Mappings for Robust Zero-Shot 3D Visual Grounding
arXiv CS.CV
arXiv:2604.26261v1 Announce Type: new Zero-shot 3D Visual Grounding (3DVG) is a critical capability for open-world embodied AI. However, existing methods are bottlenecked by two problems: the poor quality of open-vocabulary 3D proposals, which suffer from inaccurate categories and imprecise geometry, and the spatial redundancy of exhaustive multi-view reasoning. To address these challenges, we propose MCM-VG, a novel framework that achieves robust zero-shot 3DVG by explicitly establishing Multiple Consistent 2D-3D Mappings.