AI RESEARCH

Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR

arXiv cs.LG

arXiv:2605.10781v1 Announce Type: new Self-distillation has emerged as a powerful framework for post-