Linear Representations of Hierarchical Concepts in Language Models
arXiv:2604.07886v3
Abstract: We investigate how and to what extent hierarchical relations (e.g., Japan $\subset$ Eastern Asia $\subset$ Asia) are encoded in the internal representations of language models. Building on Linear Rel…