AI RESEARCH
Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision
arXiv cs.CL
•
arXiv:2604.12002v1 Announce Type: new Current post-