AI RESEARCH
Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation
arXiv cs.CV
arXiv:2602.02994v2 Announce Type: replace Reinforcement learning has emerged as a principled post-