ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation
arXiv:2604.23099v1 Announce Type: cross
Abstract: Evaluating generative AI models is increasingly resource-intensive due to slow inference, expensive raters, and a rapidly growing landscape of models and benchmarks. We propose ProEval, a proactive eva…