Enhancing SignSGD: Small-Batch Convergence Analysis and a Hybrid Switching Strategy
arXiv:2604.25550v1 Announce Type: new
Abstract: SignSGD compresses each stochastic gradient coordinate to a single bit, offering substantial memory and communication savings, but its 1-bit quantization removes magnitude information and is known to lea…
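The 1-bit compression described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `sign_compress` and `signsgd_step`, the learning rate, and the convention of mapping a zero coordinate to +1 are assumptions made for illustration. The example shows the basic update and how two gradients with very different magnitudes collapse to the same sign vector.

```python
import numpy as np

def sign_compress(grad):
    """Keep only the sign of each coordinate (hypothetical helper).

    Assumption: a zero coordinate maps to +1 so that every coordinate
    fits in one bit. Floats are used here for simplicity; a real system
    would pack the signs into bits.
    """
    return np.where(grad >= 0, 1.0, -1.0)

def signsgd_step(params, grad, lr):
    """One sign-based update: step each coordinate by lr against the gradient sign."""
    return params - lr * sign_compress(grad)

# Two gradients with very different magnitudes but identical signs.
g_small = np.array([0.001, -0.002, 0.5])
g_large = np.array([10.0, -300.0, 0.5])

# After compression they are indistinguishable: magnitude information is gone.
same_bits = np.array_equal(sign_compress(g_small), sign_compress(g_large))

params = np.zeros(3)
updated = signsgd_step(params, g_small, lr=0.1)
```

Every coordinate moves by exactly `lr` regardless of how large or small the underlying gradient entry is, which is the loss of magnitude information the abstract refers to.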