Authors: Ishani Mondal, Shweta Bhardwaj

Benchmarked Yet Not Measured — Generative AI Should be Evaluated Against Real-World Utility

Ishani Mondal, Shweta Bhardwaj / May 12, 2026

arXiv:2605.06856v2 Announce Type: replace-cross
Abstract: Generative AI systems achieve impressive performance on standard benchmarks yet fail to deliver real-world utility, a disconnect we identify across 28 deployment cases spanning education, healt…

cs.CL, cs.LG

Benchmarked Yet Not Measured — Generative AI Should be Evaluated Against Real-World Utility

Ishani Mondal, Shweta Bhardwaj / May 11, 2026

arXiv:2605.06856v1 Announce Type: cross
Abstract: Generative AI systems achieve impressive performance on standard benchmarks yet fail to deliver real-world utility, a disconnect we identify across 28 deployment cases spanning education, healthcare, s…