Toward Training Superintelligent Software Agents through Self-Play SWE-RL

arXiv:2512.18552v2 Announce Type: replace-cross While current software agents powered by large language models (LLMs) and agentic reinforcement learning (RL) can boost programmer productivity, their