AI RESEARCH
Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise
arXiv cs.LG
arXiv:2509.18001v5 Announce Type: replace
Sharpness-aware minimization (SAM) has emerged as a highly effective technique to improve model generalization, but its underlying principles are not fully understood. We investigate m-sharpness, where SAM performance improves monotonically as the micro-batch size for computing perturbations decreases, a phenomenon critical for distributed
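
To make the role of the micro-batch size concrete, here is a minimal sketch of the micro-batch perturbation scheme usually meant by m-SAM: the batch is split into micro-batches of size m, each micro-batch computes its own SAM perturbation, and the resulting gradients are averaged. The least-squares model, the names `grad` and `m_sam_gradient`, and the radius `rho` are illustrative assumptions, not taken from the paper.

```python
import math

def grad(w, batch):
    """Gradient of 0.5 * mean((w . x - y)^2) over a batch of (x, y) pairs."""
    g = [0.0] * len(w)
    for x, y in batch:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += residual * xj / len(batch)
    return g

def m_sam_gradient(w, batch, m, rho=0.05):
    """Average of SAM gradients, each computed on its own micro-batch of size m.

    Hypothetical helper: rho is the assumed perturbation radius.
    """
    if len(batch) % m != 0:
        raise ValueError("this sketch assumes m divides the batch size")
    n_micro = len(batch) // m
    total = [0.0] * len(w)
    for start in range(0, len(batch), m):
        micro = batch[start:start + m]
        g = grad(w, micro)
        # Ascent step: move the weights along the normalized micro-batch gradient.
        norm = math.sqrt(sum(gj * gj for gj in g)) + 1e-12
        w_adv = [wj + rho * gj / norm for wj, gj in zip(w, g)]
        # The descent gradient is evaluated at the perturbed weights
        # on the same micro-batch, then averaged across micro-batches.
        g_adv = grad(w_adv, micro)
        total = [t + ga / n_micro for t, ga in zip(total, g_adv)]
    return total
```

In this sketch, setting `m` equal to the batch size gives a single perturbation for the whole batch (standard SAM), while `m=1` perturbs per example; `rho=0.0` reduces to the ordinary mini-batch gradient.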