AI RESEARCH

Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise

arXiv cs.LG

arXiv:2509.18001v5

Sharpness-aware minimization (SAM) has emerged as a highly effective technique to improve model generalization, but its underlying principles are not fully understood. We investigate m-sharpness, where SAM performance improves monotonically as the micro-batch size for computing perturbations decreases, a phenomenon critical for distributed