AI RESEARCH
SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling
arXiv CS.AI
arXiv:2603.23414v1 Announce Type: cross Scaling reinforcement learning (RL) has shown strong promise for enhancing the reasoning abilities of large language models (LLMs), particularly in tasks requiring long chain-of-thought generation. However, RL