The Geometric Anatomy of Capability Acquisition in Transformers

arXiv:2602.15997v4. Abstract: Neural networks gain capabilities during training, but the internal changes that precede capability acquisition are not well understood. In particular, the relationship between geometric change and behavioral change, and the effect of task difficulty and model scale on that relationship, are unclear. We track geometric measures and linear probes across six transformer sizes (405K--151M parameters), eight algorithmic tasks (144 task$\times$level$\times$model combinations), and three Pythia language models (160M--2.8B). Across all settings, representations first collapse to a low-dimensional state, then recover, and only then does behavioral performance improve. Linear probes show that the model's hidden states already contain task-relevant information before the model can act on it. The collapse floor is task-specific, the collapse propagates top-down through the network, and of the geometric measures tested, only RankMe reliably precedes capability acquisition for hard tasks. Whether this precursor is detectable depends on task difficulty relative to model capacity. For hard tasks, there is a clear gap: geometry changes first, and behavior follows. For easy tasks, the model learns so quickly that both happen simultaneously and no precursor is detectable. On Pythia-2.8B, a logical deduction task that is genuinely hard for the model shows a precursor gap of ${\sim}$49K training steps, while easy benchmarks show none. This suggests that geometric patterns observed in small proxy models can persist at larger scale when the task remains difficult relative to model capacity.
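
For readers unfamiliar with RankMe, the following is a minimal sketch of how such an effective-rank measure is commonly computed: the exponential of the Shannon entropy of the normalized singular values of a representation matrix. The abstract does not say how the authors compute it, so the function name `rankme`, the `eps` smoothing constant, and the synthetic matrices below are illustrative assumptions and not the paper's code or data.

```python
import numpy as np


def rankme(Z, eps=1e-7):
    """Effective rank of a (samples x features) matrix Z.

    Assumed formulation: exp of the entropy of the singular values
    after normalizing them to sum to one. `eps` is a hypothetical
    smoothing constant that keeps log() finite for zero singular values.
    """
    s = np.linalg.svd(Z, compute_uv=False)   # singular values only
    p = s / (s.sum() + eps) + eps            # normalized spectrum
    return float(np.exp(-(p * np.log(p)).sum()))


# Synthetic stand-ins for hidden states (not data from the paper).
rng = np.random.default_rng(0)
full = rng.standard_normal((512, 64))        # spread over all 64 dimensions
collapsed = rng.standard_normal((512, 4)) @ rng.standard_normal((4, 64))  # rank 4

print("full:", rankme(full))
print("collapsed:", rankme(collapsed))
```

Under these assumptions, a matrix confined to a four-dimensional subspace scores at most about 4, while a matrix spread across all 64 dimensions scores far higher. Tracking such a score per layer over training checkpoints is one way a collapse followed by a recovery could be made visible.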
