Michael Krumdick, Charles Lovering, Varshini Reddy, Seth Ebner, Chris Tanner

No Free Labels: Limitations of LLM-as-a-Judge Without Human Grounding

Michael Krumdick, Charles Lovering, Varshini Reddy, Seth Ebner, Chris Tanner / April 4, 2026

arXiv:2503.05061v2
Abstract: Reliable evaluation of large language models (LLMs) is critical as their deployment rapidly expands, particularly in high-stakes domains such as business and finance. The LLM-as-a-Judge framework, wh…
