Autoscoring Anticlimax: A Meta-analytic Understanding of AI’s Short-answer Shortcomings and Wording Weaknesses
arXiv:2603.04820v2
Abstract: Automated short-answer scoring lags other LLM applications. We meta-analyze 890 culminating results across a systematic review of LLM short-answer scoring studies, modeling the traditional effect siz…