AI RESEARCH
Audio-Visual Camera Pose Estimation with Passive Scene Sounds and In-the-Wild Video
arXiv cs.CV
arXiv:2512.12165v3. Understanding camera motion is a fundamental problem in embodied perception and 3D scene understanding. While visual methods have advanced rapidly, they often struggle under visually degraded conditions such as motion blur or occlusion. In this work, we show that passive scene sounds provide cues complementary to vision for relative camera pose estimation in in-the-wild videos. We