AI RESEARCH

Distance-Aware Muon: Adaptive Step Scaling for Normalized Optimization

arXiv cs.LG

arXiv:2605.18999v1 Announce Type: new

Muon and related normalized optimizers decouple the choice of update direction from the choice of step scale, but their practical performance remains sensitive to the scale of the normalized step. We study adaptive scaling rules for Muon in general norm geometries and develop three complementary algorithms. For smooth non-convex objectives, we