AI RESEARCH

Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation

arXiv cs.CV

arXiv:2602.02994v2 Announce Type: replace Reinforcement learning has emerged as a principled post-