When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning

arXiv:2603.21289v1 Announce Type: cross Recent progress in multimodal large language models has led to strong performance on reasoning tasks, but these improvements rely largely on high-quality annotated data or teacher-model distillation, both of which are costly and difficult to scale. To address this, we propose an unsupervised self-evolution