AI RESEARCH

Does Your Optimizer Care How You Normalize? Normalization-Optimizer Coupling in LLM Training

arXiv cs.LG

arXiv:2604.01563v1 Announce Type: cross