Jasmine Qi, Danylo Dantsev, Muyang Sun

VERDI: Single-Call Confidence Estimation for Verification-Based LLM Judges via Decomposed Inference

Jasmine Qi, Danylo Dantsev, Muyang Sun / May 13, 2026

arXiv:2605.11334v1 Announce Type: cross
Abstract: LLM-as-Judge systems are widely deployed for automated evaluation, yet practitioners lack reliable methods to know when a judge’s verdict should be trusted. Token log-probabilities, the standard post-h…
