AI RESEARCH

The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes

arXiv CS.AI

arXiv:2605.11182v1 Announce Type: new On-policy distillation (OPD) and on-policy self-distillation (OPSD) have emerged as promising post-