AI RESEARCH
On the Generalization of Knowledge Distillation: An Information-Theoretic View
arXiv cs.LG
arXiv:2605.13143v1 Announce Type: cross
Knowledge distillation is widely used to improve generalization in practice, yet its theoretical understanding remains elusive. In the standard distillation setting, a teacher model provides soft predictions to guide the