AI RESEARCH
Generate, Analyze, and Refine: Training-Free Sound Source Localization via MLLM Meta-Reasoning
arXiv CS.CV
arXiv:2604.06824v1 Announce Type: new
The sound source localization (SSL) task aims to identify the locations of sound-emitting objects by leveraging correlations between the audio and visual modalities. Most existing SSL methods rely on contrastive learning-based feature matching but lack explicit reasoning and verification, which limits their effectiveness in complex acoustic scenes. Inspired by human meta-cognitive processes, we propose a