AI RESEARCH

Find, Fix, Reason: Context Repair for Video Reasoning

arXiv cs.CV

arXiv:2604.16243v1 Announce Type: new Reinforcement learning has advanced video reasoning in large multimodal models, yet dominant pipelines rely either on on-policy self-exploration, which plateaus at the model's knowledge boundary, or on hybrid replay, which mixes policies and demands careful regularization. Dynamic context methods zoom into focused evidence but often require curated pre