The Impact of LLM Self-Consistency and Reasoning Effort on Automated Scoring Accuracy and Cost
arXiv:2604.26954v1 Announce Type: cross
Abstract: Strategic model selection and reasoning settings are more effective than ensembling for optimizing automated scoring with large language models (LLMs). We examined self-consistency (intra-model majorit…
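To illustrate the self-consistency idea named in the abstract, here is a minimal sketch of an intra-model majority vote over several scores sampled from one model for the same response. The function name `majority_vote`, the integer score scale, and the tie-breaking rule (lowest tied score wins) are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
from typing import Sequence


def majority_vote(scores: Sequence[int]) -> int:
    """Return the most frequent score among repeated samples.

    Assumption: ties are broken by choosing the lowest tied score.
    The paper's actual tie-breaking rule is not stated in the abstract.
    """
    if not scores:
        raise ValueError("need at least one sampled score")
    counts = Counter(scores)
    top = max(counts.values())
    # Collect every score that reaches the top count, then pick the lowest.
    return min(score for score, count in counts.items() if count == top)


# Hypothetical example: five scores sampled from the same model for one response.
samples = [3, 2, 3, 4, 3]
final_score = majority_vote(samples)
```

Each extra sample is another model call, so the cost of this kind of voting grows roughly linearly with the number of samples. That is the cost side of the accuracy-versus-cost trade-off in the title.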