AI RESEARCH

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

arXiv CS.AI

arXiv:2604.13016v1 Announce Type: cross On-policy distillation (OPD) has become a core technique in the post-