AI RESEARCH

The Origin of Edge of Stability

arXiv cs.LG

arXiv:2604.20446v1 Announce Type: new Full-batch gradient descent on neural networks drives the largest Hessian eigenvalue to the threshold $2/\eta$, where $\eta$ is the learning rate. This phenomenon, the Edge of Stability, has resisted a unified explanation: existing accounts establish self-regulation near the edge but do not explain why the trajectory is forced toward $2/\eta$ from arbitrary initialization. We