cs.AI, cs.CL

DiffScore: Text Evaluation Beyond Autoregressive Likelihood

arXiv:2605.11601v1 Announce Type: new
Abstract: Autoregressive language models are widely used for text evaluation; however, their left-to-right factorization introduces positional bias, i.e., early tokens are scored with only leftward context, confla…
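
To make the left-to-right factorization mentioned in the abstract concrete, here is a minimal sketch of standard autoregressive scoring, log p(x) = sum over t of log p(x_t | x_<t). It is not the paper's method. The names `autoregressive_log_likelihood`, `visible_context_sizes`, and `uniform_cond_prob` are hypothetical, and the uniform toy model is an assumption standing in for a real language model.

```python
import math
from typing import Callable, List, Sequence


def autoregressive_log_likelihood(
    tokens: Sequence[str],
    cond_prob: Callable[[Sequence[str], str], float],
) -> float:
    """Sum log p(x_t | x_<t) over positions, scanning left to right."""
    total = 0.0
    for t, token in enumerate(tokens):
        prefix = tokens[:t]  # only leftward context is visible at position t
        total += math.log(cond_prob(prefix, token))
    return total


def visible_context_sizes(n: int) -> List[int]:
    """Number of context tokens available when scoring each of n positions."""
    return list(range(n))


def uniform_cond_prob(prefix: Sequence[str], token: str) -> float:
    """Toy stand-in for a language model (assumption): uniform over 4 tokens."""
    return 0.25


if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "down"]
    print(autoregressive_log_likelihood(sentence, uniform_cond_prob))
    print(visible_context_sizes(len(sentence)))
```

The context sizes grow from zero at the first position to n - 1 at the last, so under this factorization the earliest tokens are scored with the least context.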