Nathaniel Getachew, Abulhair Saparov

Language Models Might Not Understand You: Evaluating Theory of Mind via Story Prompting

Nathaniel Getachew, Abulhair Saparov / April 28, 2026

arXiv:2506.19089v5
Abstract: We introduce StorySim, a programmable framework for synthetically generating stories to evaluate the theory of mind (ToM) and world modeling (WM) capabilities of large language models (LLMs). Unlike …
