Feature Learning Dynamics in Infinite-Depth Neural Networks

arXiv:2512.21075v2

Abstract: Deep neural networks have achieved remarkable success in practice, yet a mechanistic understanding of how features evolve during training remains incomplete, especially in the large-depth limit. For ResNets under depth-$\mu$P scaling, prior work treats the layer index $\ell$ as a continuous time $t_\ell = \ell/L$, yielding SDE descriptions of the training dynamics. A key unresolved issue is that backpropagation reuses each forward weight matrix $W_\ell$ through its transpose $W_\ell^\top$, creating correlations between forward features and backward gradients whose behavior and role in feature learning remain unclear. We study this reused-weight forward--backward coupling in one-layer ResNets under depth-$\mu$P. Using conditional Gaussian representations, we explicitly separate the coupling terms induced by weight reuse from decoupled Gaussian fluctuations before taking any network limit. At initialization, we prove that the coupling is a finite-width effect and vanishes at rate $O(n^{-1})$, uniformly over depth. During training, however, SGD induces a nontrivial forward--backward correlation term that survives the infinite-width limit. The key depth effect is that, under depth-$\mu$P scaling, this surviving term is higher order in depth, and its accumulated contribution over layers becomes negligible as $L\to\infty$. This depth-induced suppression motivates Neural Feature Dynamics (NFD), a forward--backward SDE system with decoupled backward weights that retains the feature-gradient covariance structure generated during training. Under nondegeneracy assumptions, we prove that the finite-network training dynamics converge to their NFD limit with an $O(L^{-1})$ depth-discretization error, while the reused-weight coupling term decays faster, at $O(L^{-2})$. These results provide a rigorous infinite-depth limit for the feature-learning dynamics of one-layer ResNets under depth-$\mu$P.
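For intuition, here is a minimal NumPy sketch of the two backward passes contrasted above, assuming a $1/\sqrt{Ln}$ residual-branch scaling and a tanh nonlinearity (the paper's exact depth-$\mu$P parameterization, readout, and activation may differ). `backward_reused` propagates gradients through the transposed forward weights $W_\ell^\top$, which is the reused-weight coupling the abstract analyzes; `backward_decoupled` swaps in independent Gaussian matrices $B_\ell$, mirroring NFD's decoupled backward weights. All function and variable names here are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, Ws, L, n):
    # Depth-muP-style residual update: h_{l+1} = h_l + W_l tanh(h_l) / sqrt(L*n).
    hs = [x]
    for W in Ws:
        hs.append(hs[-1] + (W @ np.tanh(hs[-1])) / np.sqrt(L * n))
    return hs

def backward_reused(g_out, hs, Ws, L, n):
    # Standard backprop: each forward matrix W_l reappears as W_l^T,
    # correlating the backward gradient with the forward features.
    g = g_out
    for l in reversed(range(len(Ws))):
        dphi = 1.0 - np.tanh(hs[l]) ** 2        # tanh'(h_l)
        g = g + dphi * (Ws[l].T @ g) / np.sqrt(L * n)
    return g

def backward_decoupled(g_out, hs, Bs, L, n):
    # NFD-style backward pass: fresh Gaussian matrices B_l stand in for
    # W_l^T, removing the reused-weight coupling at the same scale.
    g = g_out
    for l in reversed(range(len(Bs))):
        dphi = 1.0 - np.tanh(hs[l]) ** 2
        g = g + dphi * (Bs[l] @ g) / np.sqrt(L * n)
    return g

L_depth, n = 32, 256
Ws = [rng.standard_normal((n, n)) for _ in range(L_depth)]
Bs = [rng.standard_normal((n, n)) for _ in range(L_depth)]
x = rng.standard_normal(n)
g_out = rng.standard_normal(n)

hs = forward(x, Ws, L_depth, n)
g_reused = backward_reused(g_out, hs, Ws, L_depth, n)
g_dec = backward_decoupled(g_out, hs, Bs, L_depth, n)

# A coupling-sensitive statistic: the normalized feature-gradient overlap
# at the input layer, for the reused and decoupled backward passes.
print(np.dot(np.tanh(hs[0]), g_reused) / n)
print(np.dot(np.tanh(hs[0]), g_dec) / n)
```

The printed overlaps are only a toy diagnostic: the paper's $O(n^{-1})$ statement at initialization is a distributional bound, not a claim about a single random draw, and the training-time correlation term that survives the infinite-width limit does not appear in this untrained snapshot.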
