Evaluating LLM Spatial Grounding: A 100-City Audit of 7,000+ Restaurant Recommendations Against Google Places Ground Truth [R]

We evaluated the spatial grounding capabilities of ChatGPT, Gemini, and Perplexity (API) by querying each for restaurant recommendations across 100 US cities and 5 cuisine types. Using the Google Places API as ground truth, we measured hallucination rates, "permanently closed" retrieval errors, and distance-from-center accuracy. We combined these measurements into a City IQ Score.

Key Findings

Chicago ranked first: the models were most accurate overall for Chicago restaurants (City IQ = 89).

Staleness: ~600 recommendations were for businesses closed, clear