Understanding the Staged Dynamics of Transformers in Learning Latent Structure
arXiv:2511.19328v2 Announce Type: replace
Abstract: Language modeling has shown us that transformers can discover latent structure from context, but the dynamics of how they acquire different components of that structure remain poorly understood, lead…