AI RESEARCH
Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes
arXiv CS.AI
arXiv:2603.25562v1 Announce Type: cross On-policy distillation (OPD) is appealing for large language model (LLM) post-