Alex L. Zhang, Thomas L. Griffiths, Karthik R. Narasimhan, Ofir Press

VideoGameBench: Can Vision-Language Models complete popular video games?

May 18, 2026

arXiv:2505.18134v3
Abstract: Vision-language models (VLMs) have achieved strong results on coding and math benchmarks that are challenging for humans, yet their ability to perform tasks that come naturally to humans, such as per…
