Deniz Bayazit, Aaron Mueller, Antoine Bosselut

Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining

May 1, 2026

arXiv:2509.05291v2
Abstract: Large language models (LLMs) learn non-trivial abstractions during pretraining, such as detecting irregular plural noun subjects. However, because traditional evaluation methods (e.g., benchmar…
