Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning

arXiv:2512.09924v3 Announce Type: replace Unified video models exhibit strong capabilities in understanding and generation, yet they struggle with reason-informed visual editing even when equipped with powerful internal vision-language models (VLMs). We attribute this gap to two factors: (1) existing datasets are inadequate for