LLMs Do Not Grade Essays Like Humans
arXiv:2603.23714v1 Announce Type: new
Abstract: Large language models have recently been proposed as tools for automated essay scoring, but their agreement with human grading remains unclear. In this work, we evaluate how LLM-generated scores compare …