A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

arXiv:2603.24202v1

Reinforcement learning (RL) has emerged as a powerful paradigm for improving large language models beyond supervised fine-tuning, yet sustaining performance gains at scale remains an open challenge, as data diversity and structure, rather than volume alone, become the limiting factor. We address this by