Muon-OGD: Muon-based Spectral Orthogonal Gradient Projection for LLM Continual Learning

arXiv cs.LG

arXiv:2605.08949v1 Announce Type: new

A central challenge in continual learning for large language models (LLMs) is catastrophic forgetting, where adapting to new tasks can substantially degrade performance on previously learned ones. Existing projection-based methods mitigate such interference by restricting parameter updates to subspaces that are orthogonal to directions associated with past tasks. However, these methods are typically formulated under Euclidean parameter geometry, with update magnitudes and projections governed by the Frobenius norm.
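
To make the projection idea concrete, the following is a minimal NumPy sketch of a generic Euclidean orthogonal gradient projection of the kind the abstract attributes to existing methods. It is not the Muon-OGD method itself. The function names, the toy dimensions, and the random "past-task directions" are all assumptions introduced for illustration.

```python
import numpy as np

def orthonormal_basis(past_directions):
    """Return an orthonormal basis (as columns) for the span of past-task directions.

    Assumes the columns of `past_directions` are linearly independent.
    """
    q, _ = np.linalg.qr(past_directions)  # reduced QR: q has orthonormal columns
    return q

def project_orthogonal(grad, basis):
    """Remove from `grad` the component that lies in the span of `basis`."""
    return grad - basis @ (basis.T @ grad)

# Hypothetical toy data: two past-task directions and one new-task gradient in R^6.
rng = np.random.default_rng(0)
past = rng.standard_normal((6, 2))
grad = rng.standard_normal(6)

basis = orthonormal_basis(past)
update = project_orthogonal(grad, basis)

# The projected update has (numerically) zero inner product with every past direction.
print(np.abs(past.T @ update).max())
```

In this Euclidean setting the projection never increases the Euclidean norm of the gradient (the Frobenius norm, for matrix-shaped parameters), which is the sense in which magnitudes and projections are governed by that norm.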