cs.LG, math.DS

Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks

arXiv:2508.12121v5 Announce Type: replace
Abstract: We show that gating mechanisms in recurrent neural networks (RNNs) induce lag-dependent and direction-dependent effective learning rates, even when training uses a fixed, global step size. This behav…
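The lag dependence described in the abstract can be illustrated with a toy model. The sketch below is not the paper's derivation: it assumes a scalar leaky recurrence h_t = (1 - z) * h_{t-1} + z * x_t with a constant gate value z in (0, 1), and the names `state_jacobian`, `effective_step_sizes`, and `finite_difference_jacobian` are hypothetical. Under that assumption, the sensitivity of the state across a lag of k steps is (1 - z)^k, so a gradient contribution routed through lag k is scaled as if the fixed step size eta were eta * (1 - z)^k. It covers only lag dependence, not direction dependence.

```python
def state_jacobian(z, lag):
    """Sensitivity d h_T / d h_{T-lag} for the assumed toy recurrence
    h_t = (1 - z) * h_{t-1} + z * x_t with a constant scalar gate z."""
    return (1.0 - z) ** lag


def effective_step_sizes(eta, z, max_lag):
    """Fixed global step size eta rescaled by the lag-k state sensitivity.
    Returns one value per lag k = 0, 1, ..., max_lag."""
    return [eta * state_jacobian(z, k) for k in range(max_lag + 1)]


def finite_difference_jacobian(z, lag, eps=1e-6, x=0.5):
    """Numerical check: perturb the initial state, run the recurrence
    for `lag` steps with an arbitrary constant input x, and difference."""
    def run(h0):
        h = h0
        for _ in range(lag):
            h = (1.0 - z) * h + z * x
        return h
    return (run(1.0 + eps) - run(1.0)) / eps


if __name__ == "__main__":
    eta = 0.1
    for z in (0.1, 0.5, 0.9):
        sizes = effective_step_sizes(eta, z, max_lag=4)
        print(f"z={z}: " + ", ".join(f"{s:.5f}" for s in sizes))
```

In this toy setting a gate value near 1 overwrites the state quickly, so the effective step size for long lags collapses, while a gate value near 0 preserves long-lag sensitivity. In a trained gated RNN the gate varies with time and input, so the scaling would be a product of per-step factors rather than a single power.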