cs.LG

Unsupervised Confidence Calibration for Reasoning LLMs from a Single Generation

arXiv:2604.19444v1 Announce Type: new
Abstract: Reasoning language models can solve increasingly complex tasks, but they struggle to produce the calibrated confidence estimates necessary for reliable deployment. Existing calibration methods usually depend …