AI RESEARCH

Understanding Quantization of Optimizer States in LLM Pre-training: Dynamics of State Staleness and Effectiveness of State Resets

arXiv cs.LG

arXiv:2603.16731v1 Announce Type: new Quantizing optimizer states is becoming an important ingredient of memory-efficient large-scale pre-