GeoBridge: A Semantic-Anchored Multi-View Foundation Model Bridging Images and Text for Geo-Localization

arXiv:2512.02697v2. Cross-view geo-localization infers a location by retrieving geo-tagged reference images that visually correspond to a query image. However, the traditional satellite-centric paradigm limits robustness when high-resolution or up-to-date satellite imagery is unavailable. It also underexploits complementary cues across views (e.g., drone, satellite, and street) and modalities (e.g., language and image). To address these challenges, we propose GeoBridge, a novel model that performs bidirectional matching across views and supports language-to-image retrieval.
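
The retrieval formulation described above can be illustrated with a minimal sketch. It assumes that queries (an image from any view, or a text description) and geo-tagged reference images have already been encoded into one shared embedding space. The function names `cosine` and `retrieve`, the three-dimensional toy vectors, and the coordinates are hypothetical and chosen only for illustration; this is not GeoBridge's architecture or training procedure, only the generic nearest-neighbor step that retrieval-based geo-localization relies on.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_emb, gallery, top_k=1):
    """Return the geo-tags of the top_k gallery entries most similar to the query.

    gallery is a list of (embedding, (lat, lon)) pairs. The embeddings are
    assumed to come from encoders that share one space (hypothetical here).
    """
    ranked = sorted(gallery, key=lambda item: cosine(query_emb, item[0]), reverse=True)
    return [location for _, location in ranked[:top_k]]


# Toy gallery: hand-written embeddings paired with made-up coordinates.
gallery = [
    ([1.0, 0.0, 0.0], (48.8584, 2.2945)),
    ([0.0, 1.0, 0.0], (40.6892, -74.0445)),
    ([0.0, 0.0, 1.0], (35.6586, 139.7454)),
]

# The query embedding could stand in for a drone image, a street image, or a text prompt.
query = [0.1, 0.9, 0.2]
best = retrieve(query, gallery, top_k=1)
```

Because the ranking depends only on embeddings, the same `retrieve` call would serve image-to-image matching in either direction as well as language-to-image retrieval, provided the encoders (not shown) map all inputs into the same space.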