cs.AI, cs.CL

PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking

arXiv:2604.17819v1 Announce Type: new
Abstract: Large language models (LLMs) perform substantially below human level on existing theory-of-mind (ToM) benchmarks, even when augmented with chain-of-thought prompting or probabilistic belief updates. We a…