Hui Sun, Yun-Ji Zhang, Zheng Xie, Ren-Biao Liu, Yali Du, Xin-Ye Li, Ming Li

ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation

Hui Sun, Yun-Ji Zhang, Zheng Xie, Ren-Biao Liu, Yali Du, Xin-Ye Li, Ming Li / April 7, 2026

arXiv:2604.03922v1 Announce Type: new
Abstract: Selecting LLM-generated code candidates using LLM-generated tests is challenging because the tests themselves may be incorrect. Existing methods either treat all tests equally or rely on ad-hoc heuristic…

Author name: Hui Sun, Yun-Ji Zhang, Zheng Xie, Ren-Biao Liu, Yali Du, Xin-Ye Li, Ming Li

ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation