cs.AI

Unsteady Metrics and Benchmarking Cultures of AI Model Builders

arXiv:2605.14164v1
Abstract: The primary way to establish and compare competencies in foundation and generative AI models has shifted from peer-reviewed literature to press releases and company blog posts, where model builders highl…