cs.AI, cs.CL, cs.CY

Do Small Language Models Know When They’re Wrong? Confidence-Based Cascade Scoring for Educational Assessment

arXiv:2604.19781v1 Announce Type: cross
Abstract: Automated scoring of student work at scale requires balancing accuracy against cost and latency. In “cascade” systems, small language models (LMs) handle easier scoring tasks while escalating harder on…
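
The cascade idea described in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so the escalation rule below (a fixed confidence threshold) is an assumption rather than the authors' method, and every name in it (`cascade_score`, `small_stub`, `large_stub`, `threshold`) is hypothetical. The stub scorers stand in for real models only so the example runs.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical scorer interface: takes a student response and returns
# (score, confidence), with confidence assumed to lie in [0, 1].
Scorer = Callable[[str], Tuple[int, float]]


@dataclass
class CascadeResult:
    score: int
    confidence: float
    escalated: bool  # True if the larger model produced the final score


def cascade_score(response: str, small: Scorer, large: Scorer,
                  threshold: float = 0.8) -> CascadeResult:
    """Score with the small model; escalate when its confidence is low.

    The fixed threshold is an illustrative assumption, not a value
    taken from the paper.
    """
    score, conf = small(response)
    if conf >= threshold:
        return CascadeResult(score, conf, escalated=False)
    # Small model is unsure: pay the extra cost of the larger model.
    score, conf = large(response)
    return CascadeResult(score, conf, escalated=True)


# Stub scorers for demonstration only; real systems would call actual models.
def small_stub(response: str) -> Tuple[int, float]:
    # Pretend the small model is confident only on very short responses.
    conf = 0.9 if len(response.split()) < 5 else 0.4
    return 1, conf


def large_stub(response: str) -> Tuple[int, float]:
    return 2, 0.95


easy = cascade_score("short answer", small_stub, large_stub)
hard = cascade_score("a much longer and more ambiguous student answer",
                     small_stub, large_stub)
print(easy.escalated, hard.escalated)
```

Raising `threshold` sends more responses to the larger model, which trades higher cost and latency for (presumably) better accuracy. The sketch only pays off if the small model's confidence actually tracks its correctness, which is the question the title poses.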