Frontier Lag: A Bibliometric Audit of Capability Misrepresentation in Academic AI Evaluation
arXiv:2605.04135v1 Announce Type: cross
Abstract: Readers of applied-domain LLM capability evaluations want to know what AI systems can currently do. That literature answers a related, but consequentially different, question: what older, cheaper, less…