Shanghua Gao, Yuchang Su, Pengwei Sui, Curtis Ginder, Marinka Zitnik

Qworld: Question-Specific Evaluation Criteria for LLMs

March 26, 2026

arXiv:2603.23522v1 Announce Type: cross
Abstract: Evaluating large language models (LLMs) on open-ended questions is difficult because response quality depends on the question’s context. Binary scores and static rubrics fail to capture these context-d…
