AI RESEARCH

Pre-training LLM without Learning Rate Decay Enhances Supervised Fine-Tuning

arXiv CS.LG • March 18, 2026

arXiv:2603.16127v1 Announce Type: cross We investigate the role of learning rate scheduling in the large-scale pre-