AI RESEARCH

Beyond Variance: Prompt-Efficient RLVR via Rare-Event Amplification and Bidirectional Pairing

arXiv cs.AI

arXiv:2602.03452v2 Announce Type: replace-cross Reinforcement learning with verifiable rewards (RLVR) is effective for