cs.LG

Taking the Road Less Scheduled with Adaptive Polyak Steps

arXiv:2511.07767v2 Announce Type: replace
Abstract: Schedule-Free SGD, proposed in [Defazio et al., 2024], achieves optimal convergence rates without requiring the training horizon in advance, by replacing learning rate schedules with a principled for…
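
For orientation, the following is a minimal sketch of the base Schedule-Free SGD recursion from [Defazio et al., 2024] that the abstract refers to. It uses a constant step size and uniform averaging, and it does not attempt to reproduce the adaptive Polyak step of this paper, since the abstract is truncated here. The function name `schedule_free_sgd`, the toy quadratic objective, and the values of `lr`, `beta`, and `steps` are illustrative assumptions, not taken from the source.

```python
# Sketch of Schedule-Free SGD with a constant step size (assumption: no
# warmup, uniform averaging weights c = 1/(t+1)). Not the adaptive Polyak
# variant of the paper above.

def schedule_free_sgd(grad, z0, lr=0.5, beta=0.9, steps=500):
    """Run the three-sequence Schedule-Free recursion and return x."""
    z = list(z0)  # base SGD iterate
    x = list(z0)  # running uniform average of the z iterates
    for t in range(1, steps + 1):
        # The gradient is evaluated at an interpolation between z and x.
        y = [(1.0 - beta) * zi + beta * xi for zi, xi in zip(z, x)]
        g = grad(y)
        # Plain gradient step on the base sequence.
        z = [zi - lr * gi for zi, gi in zip(z, g)]
        # Online update of the average; no training horizon is needed.
        c = 1.0 / (t + 1)
        x = [(1.0 - c) * xi + c * zi for xi, zi in zip(x, z)]
    return x


# Hypothetical toy problem: f(w) = 0.5 * ||w - TARGET||^2.
TARGET = [3.0, -2.0]


def quad_grad(w):
    """Gradient of the toy quadratic at w."""
    return [wi - ti for wi, ti in zip(w, TARGET)]


x_final = schedule_free_sgd(quad_grad, [0.0, 0.0])
print(x_final)
```

The point of the construction is that the returned average `x` is updated online, so the loop can be stopped at any step without having fixed `steps` in advance to shape a learning rate schedule.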