Mengfan Li, Xuanhua Shi, Yang Deng

CoSToM: Causal-oriented Steering for Intrinsic Theory-of-Mind Alignment in Large Language Models

Mengfan Li, Xuanhua Shi, Yang Deng / April 14, 2026

arXiv:2604.10031v1 Announce Type: new
Abstract: Theory of Mind (ToM), the ability to attribute mental states to others, is a hallmark of social intelligence. While large language models (LLMs) demonstrate promising performance on standard ToM benchmar…
