Tairan Fu, Javier Conde, Gonzalo Martínez, María Grandury, Pedro Reviriego

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident, Especially When They are Wrong

Tairan Fu, Javier Conde, Gonzalo Martínez, María Grandury, Pedro Reviriego / May 5, 2026

arXiv:2501.09775v3
Abstract: Multiple Choice Question (MCQ) tests are among the most used methods for evaluating large language models (LLMs). Besides checking the correctness of the selected answer, evaluations often cons…
