Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum

arXiv:2603.18325v1 Announce Type: new Chain-of-thought reasoning, where language models expend additional computation by producing thinking tokens prior to final responses, has driven significant advances in model capabilities. However