E-Scores for (In)Correctness Assessment of Generative Model Outputs
arXiv:2510.25770v2 Announce Type: replace-cross
Abstract: While generative models, especially large language models (LLMs), are ubiquitous in today’s world, principled mechanisms to assess their (in)correctness are limited. Using the conformal predict…