AI RESEARCH
Audio-to-Image Bird Species Retrieval without Audio-Image Pairs via Text Distillation
arXiv cs.LG
arXiv:2602.00681v2 Announce Type: replace-cross Audio-to-image retrieval offers an interpretable alternative to audio-only classification for bioacoustic species recognition, but learning aligned audio-image representations is difficult because paired audio-image data are scarce. We propose a simple, data-efficient approach that enables audio-to-image retrieval without any audio-image supervision.