AI RESEARCH
Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR
arXiv cs.LG
arXiv:2605.10781v1 Announce Type: new Self-distillation has emerged as a powerful framework for post-