AI RESEARCH
Enhancing SignSGD: Small-Batch Convergence Analysis and a Hybrid Switching Strategy
arXiv CS.LG
arXiv:2604.25550v1 Announce Type: new
SignSGD compresses each stochastic gradient coordinate to a single bit, offering substantial memory and communication savings, but its 1-bit quantization discards magnitude information and is known to leave a generalization gap relative to well-tuned SGD. We revisit SignSGD from a 1-bit quantization and dithering perspective and contribute three improvements.
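
To make the 1-bit compression concrete, here is a minimal NumPy sketch of a plain SignSGD step and of packing gradient signs into bits. It is an illustration of the baseline the abstract describes, not the paper's improved method; the function names `signsgd_step` and `pack_signs`, the learning rate, and the toy values are assumptions chosen for the example.

```python
import numpy as np

def signsgd_step(params, grad, lr):
    """One plain SignSGD update: step each coordinate by lr against the gradient sign.

    np.sign returns 0 for an exactly zero gradient, so that coordinate is left unchanged.
    """
    return params - lr * np.sign(grad)

def pack_signs(grad):
    """Pack one bit per coordinate (1 = non-negative, 0 = negative) into uint8 bytes.

    Assumption for this sketch: a zero gradient is encoded as non-negative.
    """
    return np.packbits(grad >= 0)

# Toy values (hypothetical): gradient magnitudes span several orders of magnitude.
params = np.array([0.5, -1.0, 2.0, 0.0])
grad = np.array([0.03, -4.0, 1e-6, -0.2])

new_params = signsgd_step(params, grad, lr=0.1)
packed = pack_signs(grad)

print(new_params)     # every coordinate moved by exactly lr, regardless of |grad|
print(packed.nbytes)  # 4 float64 gradient values (32 bytes) fit in 1 byte of signs
```

Note that the coordinates with gradients 0.03 and 1e-6 receive the same step size as the one with gradient -4.0. This is the loss of magnitude information that the abstract identifies as the source of the gap relative to SGD.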