AI RESEARCH
SOLAR: Self-supervised Joint Learning for Symmetric Multimodal Retrieval
arXiv cs.CV
arXiv:2605.15868v1 Announce Type: new In this work, we address the critical yet underexplored challenge of symmetric multimodal-to-multimodal (MM2MM) retrieval, where queries and contexts are interchangeable. Existing universal multimodal retrieval methods struggle with this task because they are constrained by the labeled asymmetric datasets they rely on. We propose SOLAR (Self-supervised jOint LeArning for symmetric multimodal Retrieval), a novel two-stage self-supervised framework that leverages readily available unlabeled web-scale image-text pairs.