cs.CL, cs.LG

Calibrating LLMs with Semantic-level Reward

arXiv:2605.15588v1 Announce Type: new
Abstract: As large language models (LLMs) are deployed in consequential settings such as medical question answering and legal reasoning, the ability to estimate when their outputs are likely to be correct is essen…