HumanLLM: Benchmarking and Improving LLM Anthropomorphism via Human Cognitive Patterns

arXiv:2601.10198v4. Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and generation, serving as the foundation for advanced persona simulation and Role-Playing Language Agents (RPLAs). However, achieving authentic alignment with human cognitive and behavioral patterns remains a critical challenge for these agents. We present HumanLLM, a framework treating psychological patterns as interacting causal forces. We construct 244 patterns from $\sim$12,000 academic papers and synthesize 11,359 scenarios where 2-5 patterns reinforce, conflict, or modulate each other, with multi-turn conversations expressing inner thoughts, actions, and dialogue. Our dual-level checklists evaluate both individual pattern fidelity and emergent multi-pattern dynamics, achieving strong human alignment ($r=0.90$) while revealing that holistic metrics conflate simulation accuracy with social desirability. HumanLLM-8B outperforms Qwen3-32B on multi-pattern dynamics despite 4$\times$ fewer parameters, demonstrating that authentic anthropomorphism requires cognitive modeling: simulating not just what humans do, but the psychological processes generating those behaviors. Our dataset, code, and model are available at: https://github.com/YJGoodbye2024/HumanLLM
