AI RESEARCH

Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision

arXiv CS.CL

arXiv:2604.12002v1 Announce Type: new Current post-