AI RESEARCH
MovieRecapsQA: A Multimodal Open-Ended Video Question-Answering Benchmark
arXiv CS.CV
arXiv:2601.02536v2
Understanding real-world videos such as movies requires integrating visual and dialogue cues. Yet existing VideoQA benchmarks struggle to capture this multimodal reasoning and, given the difficulty of evaluating free-form answers, largely resort to simple multiple-choice questions. We