SigLino: Efficient Multi-Teacher Distillation for Agglomerative Vision Foundation Models

arXiv:2512.20157v2

Vision foundation models trained via multi-teacher distillation offer a promising path toward unified visual representations, yet the learning dynamics and data efficiency of such approaches remain underexplored. In this paper, we systematically study multi-teacher distillation for vision foundation models and identify key factors that enable