Authors: Yutong Gao, Qinglin Meng, Yuan Zhou, Liangming Pan

Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures

Yutong Gao, Qinglin Meng, Yuan Zhou, Liangming Pan / April 21, 2026

arXiv:2604.16042v2 Announce Type: cross
Abstract: While Large Language Models (LLMs) have achieved strong performance across many NLP tasks, their opaque internal mechanisms hinder trustworthiness and safe deployment. Existing surveys in explainable A…

cs.AI, cs.CL, cs.LG

Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures

Yutong Gao, Qinglin Meng, Yuan Zhou, Liangming Pan / April 20, 2026

arXiv:2604.16042v1 Announce Type: cross