cond-mat.dis-nn, cs.LG, stat.ML

Theory of Optimal Learning Rate Schedules and Scaling Laws for a Random Feature Model

arXiv:2602.04774v2 Announce Type: replace-cross
Abstract: Setting the learning rate (LR) for a deep learning model is a critical part of successful training. LRs are often chosen empirically, by trial and error. In this work, we explore a solva…