cs.AI, cs.CL

LLMs Do Not Grade Essays Like Humans

arXiv:2603.23714v1 Announce Type: new
Abstract: Large language models have recently been proposed as tools for automated essay scoring, but their agreement with human grading remains unclear. In this work, we evaluate how LLM-generated scores compare …