Semantic Audio-Visual Navigation in Continuous Environments
arXiv cs.CV
arXiv:2603.19660v1 (announce type: new)

Audio-visual navigation enables embodied agents to navigate toward sound-emitting targets by leveraging both auditory and visual cues. However, most existing approaches rely on precomputed room impulse responses (RIRs) for binaural audio rendering, which restricts agents to discrete grid positions and produces spatially discontinuous observations. To establish a realistic setting, we