cs.AI, cs.CL

Evaluating Language Models’ Evaluations of Games

arXiv:2510.10930v2 Announce Type: replace
Abstract: Reasoning is not just about solving problems — it is also about evaluating which problems are worth solving at all. Evaluations of artificial intelligence (AI) systems primarily focused on problem s…
