Cole Walsh, Rodica Ivan

Measuring What Matters — or What’s Convenient?: Robustness of LLM-Based Scoring Systems to Construct-Irrelevant Factors

Cole Walsh, Rodica Ivan / March 27, 2026

arXiv:2603.25674v1 Announce Type: new
Abstract: Automated systems have been widely adopted across the educational testing industry for open-response assessment and essay scoring. These systems commonly achieve performance levels comparable to or super…
