Video-CoE: Reinforcing Video Event Prediction via Chain of Events

arXiv:2603.14935v1 Announce Type: new

Despite advances in applying multimodal large language models (MLLMs) to a range of video tasks, video event prediction (VEP) remains relatively underexplored. VEP requires a model to perform fine-grained temporal modeling of a video and to establish logical relationships between the video and future events, both of which current MLLMs still struggle with.