AI RESEARCH
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
arXiv CS.AI
arXiv:2604.13016v1 Announce Type: cross On-policy distillation (OPD) has become a core technique in the post-