cs.LG, cs.NI

DBLP: Phase-Aware Bounded-Loss Transport for Burst-Resilient Distributed ML Training

arXiv:2605.01989v1 Announce Type: new
Abstract: Distributed machine learning (ML) training has become a necessity with the prevalence of billion- to trillion-parameter-scale models. While prior work has improved training efficiency from the ML perspect…