GeoRC: A Benchmark for Geolocation Reasoning Chains

arXiv:2601.21278v2

Vision Language Models (VLMs) are good at recognizing the global location of a photograph: their geolocation prediction accuracy rivals that of the best human experts. But many VLMs are startlingly bad at explaining which image evidence led to their prediction, even when their location prediction is correct. In this paper, we