Unveiling High-Probability Generalization in Decentralized SGD
arXiv:2605.10205v1 Announce Type: new
Abstract: Decentralized stochastic gradient descent (D-SGD) is an efficient method for large-scale distributed learning. Existing generalization analyses mainly establish bounds in expectation, achieving rates limited to …