cs.AI, cs.LG, math.OC, stat.CO, stat.ML

The Effect of Mini-Batch Noise on the Implicit Bias of Adam

arXiv:2602.01642v2 Announce Type: replace-cross
Abstract: With limited high-quality data and growing compute, multi-epoch training is regaining importance across sub-areas of deep learning. Adam(W), versions of which are go-to optimizers for ma…