We evaluated the spatial grounding capabilities of ChatGPT, Gemini, and Perplexity (API) by asking each for restaurant recommendations across 100 US cities and 5 cuisine types. Using the Google Places API as ground truth, we measured hallucination rates, "permanently closed" retrieval errors, and distance-from-center accuracy. We combined these metrics into a per-city City IQ Score.
Key Findings
- Chicago Ranked #1: The models' restaurant recommendations were most accurate overall for Chicago. (City IQ = 89)
- Staleness: ~600 recommendations were for businesses that are permanently closed, a clear sign of training data latency.
- Spatial Drift: 1,078 picks were in the wrong city entirely.
Methodology
City IQ is a 100-point composite computed per city across all verified recommendations: Existence Rate (30 pts), Cuisine Accuracy (20 pts), Independence Rate (20 pts), Bayesian Quality (20 pts), and Location Accuracy (10 pts). Bayesian scoring was used for top picks: each restaurant's Google rating is weighted by its review count against the dataset mean. It is interesting to see what a machine recommends for food, along with how accurate and how frequent those recommendations are.
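For anyone who wants to reproduce the scoring, here is a rough Python sketch of how a composite like this could be computed. Only the point weights come from the description above. The function names, the `prior_weight` value, the 0-to-1 scaling of each component, and dividing the Bayesian rating by 5 to normalize it are all assumptions for illustration, not necessarily what the full report does.

```python
# Point weights as stated in the post (they sum to 100).
WEIGHTS = {
    "existence": 30,
    "cuisine": 20,
    "independence": 20,
    "quality": 20,
    "location": 10,
}


def bayesian_rating(rating, review_count, dataset_mean, prior_weight=50):
    """Shrink a Google rating toward the dataset mean.

    prior_weight is a hypothetical value: it acts like that many
    "virtual" reviews at the dataset mean, so places with few reviews
    are pulled strongly toward the mean.
    """
    total = review_count + prior_weight
    return (review_count * rating + prior_weight * dataset_mean) / total


def city_iq(components):
    """Combine per-city component scores (each assumed in [0, 1]) into 0-100."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Made-up example numbers, not taken from the dataset.
quality = bayesian_rating(4.8, 50, 4.2) / 5.0  # assumed normalization by max rating
score = city_iq({
    "existence": 0.95,
    "cuisine": 0.90,
    "independence": 0.80,
    "quality": quality,
    "location": 0.85,
})
```

The shrinkage step is why a 4.8-star place with 50 reviews does not automatically outrank a 4.6-star place with thousands: with few reviews the score stays close to the dataset mean.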
Full Report & Dataset:
https://aiagentsbuzz.com/research/ai-restaurant-recommendations.html