Convergent Evolution: How Different Language Models Learn Similar Number Representations
arXiv:2604.20817v1 Announce Type: cross
Abstract: Language models trained on natural text learn to represent numbers using periodic features with dominant periods at $T=2, 5, 10$. In this paper, we identify a two-tiered hierarchy of these features: wh…
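As a rough illustration of what "periodic features" with periods $T=2, 5, 10$ could look like, the sketch below encodes each integer $n$ as the pair $\cos(2\pi n/T)$, $\sin(2\pi n/T)$ for each period. This is not the paper's method: the cosine/sine parameterization and the helper name `periodic_features` are assumptions made for illustration. Only the three periods come from the abstract.

```python
import numpy as np

PERIODS = (2, 5, 10)  # the dominant periods named in the abstract


def periodic_features(n, periods=PERIODS):
    """Hypothetical helper: cos and sin of 2*pi*n/T for each period T.

    Returns an array whose last axis has length 2 * len(periods):
    all cosines first, then all sines.
    """
    n = np.asarray(n, dtype=float)[..., None]   # shape (..., 1)
    t = np.asarray(periods, dtype=float)        # shape (len(periods),)
    angle = 2.0 * np.pi * n / t                 # broadcasts to (..., len(periods))
    return np.concatenate([np.cos(angle), np.sin(angle)], axis=-1)


numbers = np.arange(100)
feats = periodic_features(numbers)              # shape (100, 6)

# Because 2 and 5 both divide 10, integers that agree modulo 10
# (same last decimal digit) receive identical feature vectors.
same_last_digit = np.allclose(feats[3], feats[73])
different_digit = np.allclose(feats[3], feats[4])
```

Under this encoding, `same_last_digit` is true and `different_digit` is false, so the features separate integers by their last decimal digit and nothing else. That is one simple way to picture why periods tied to base-10 digit structure might arise.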