AI RESEARCH
Understanding Quantization of Optimizer States in LLM Pre-training: Dynamics of State Staleness and Effectiveness of State Resets
arXiv cs.LG
arXiv:2603.16731v1 Announce Type: new
Quantizing optimizer states is becoming an important ingredient of memory-efficient large-scale pre-