Take Out Your Calculators: Estimating the Real Difficulty of Question Items with LLM Student Simulations
arXiv:2601.09953v2 Announce Type: replace
Abstract: Standardized math assessments require expensive human pilot studies to establish the difficulty of test items. We investigate the predictive value of open-source large language models (LLMs) for eval…