AI RESEARCH

D$^2$Evo: Dual Difficulty-Aware Self-Evolution for Data-Efficient Reinforcement Learning

arXiv CS.LG

arXiv:2605.17037v1 Announce Type: new Reinforcement learning (RL) has demonstrated potential for enhancing reasoning in large language models (LLMs). However, effective RL