AI RESEARCH
From SGD to Muon: Adaptive Optimization via Schatten-p Norms
arXiv CS.AI
arXiv:2605.19781v1. Modern optimizers such as Muon impose matrix-wise geometric constraints on their updates. These constraints can be unified under Linear Minimization Oracle (LMO) theory. However, all current methods fix the LMO geometry of the update rule in advance, either by design or empirically, and that choice is not necessarily optimal for the geometry of the problem. We
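
To make the LMO view concrete, here is a minimal sketch of the standard closed-form LMO over a Schatten-p unit ball, written with NumPy. It is an illustration of the general idea only, not the method proposed in the paper. The function name `schatten_lmo` and the test matrix `G` are hypothetical. The sketch uses an exact SVD for clarity, whereas Muon in practice approximates the orthogonal factor iteratively and adds momentum. It also assumes a nonzero gradient.

```python
import numpy as np

def schatten_lmo(grad, p):
    """Minimizer of <grad, X> over the unit ball of the Schatten-p norm.

    With grad = U diag(s) V^T and 1/p + 1/q = 1, the minimizer is
    -U diag(s**(q-1) / ||s||_q**(q-1)) V^T. Assumes grad is nonzero.
    """
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    if np.isinf(p):
        # Spectral-norm ball (q = 1): every singular value becomes 1.
        weights = np.ones_like(s)
    elif p == 1:
        # Nuclear-norm ball (q = inf): keep only the top singular direction.
        weights = np.zeros_like(s)
        weights[0] = 1.0
    else:
        q = p / (p - 1.0)
        weights = s ** (q - 1.0)
        weights /= np.linalg.norm(s, ord=q) ** (q - 1.0)
    # Scale the columns of U by the weights, then recombine with V^T.
    return -(u * weights) @ vt

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 3))        # stand-in for a gradient matrix

step_frobenius = schatten_lmo(G, 2)      # normalized-gradient direction
step_spectral = schatten_lmo(G, np.inf)  # orthogonalized, Muon-style direction
```

Under these assumptions, p = 2 recovers the Frobenius-normalized gradient direction used by normalized SGD, and p = ∞ recovers the orthogonalized update associated with Muon, so varying p interpolates between the two geometries named in the title.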