AI RESEARCH

Retrieving Any Relevant Moments: Benchmark and Models for Generalized Moment Retrieval

arXiv cs.CV

arXiv:2605.02623v1 Announce Type: new

Video Moment Retrieval (VMR) aims to localize the temporal segments of a video that correspond to a natural language query, but it typically assumes exactly one matching moment per query. This assumption does not always hold in real-world scenarios, where a query may correspond to multiple moments or to none. We therefore formulate Generalized Moment Retrieval (GMR), a unified setting that requires retrieving the complete set of relevant moments or predicting an empty set. To enable systematic study of GMR, we.
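To make the difference in output contract concrete, here is a minimal sketch contrasting single-moment VMR with the GMR setting described in the abstract. It is not the paper's model: the function names (`vmr_top1`, `gmr_select`), the score threshold, and the temporal-IoU suppression step are all assumptions chosen for illustration, and the candidate scores are assumed to come from some upstream scorer.

```python
from typing import List, Tuple

Moment = Tuple[float, float]          # (start_sec, end_sec)
Scored = Tuple[float, float, float]   # (start_sec, end_sec, relevance score)


def temporal_iou(a: Moment, b: Moment) -> float:
    """Intersection-over-union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def vmr_top1(candidates: List[Scored]) -> Moment:
    """Classic VMR contract: always return exactly one moment.

    Assumes `candidates` is non-empty.
    """
    start, end, _ = max(candidates, key=lambda c: c[2])
    return (start, end)


def gmr_select(
    candidates: List[Scored],
    score_thresh: float = 0.5,  # hypothetical value, not from the paper
    iou_thresh: float = 0.5,    # hypothetical value, not from the paper
) -> List[Moment]:
    """GMR-style contract: return every relevant moment, or an empty list.

    Illustrative heuristic only: keep candidates scoring at least
    `score_thresh`, then greedily drop near-duplicates by temporal IoU.
    """
    kept: List[Moment] = []
    for start, end, score in sorted(candidates, key=lambda c: c[2], reverse=True):
        if score < score_thresh:
            break  # remaining candidates score even lower
        if all(temporal_iou((start, end), k) <= iou_thresh for k in kept):
            kept.append((start, end))
    return sorted(kept)


candidates = [(0.0, 10.0, 0.9), (1.0, 11.0, 0.8), (30.0, 40.0, 0.7), (50.0, 60.0, 0.2)]
multi = gmr_select(candidates)             # two distinct moments survive
none = gmr_select([(0.0, 10.0, 0.1)])      # no candidate is relevant: empty list
forced = vmr_top1([(0.0, 10.0, 0.1)])      # VMR still returns a moment
```

In this sketch the single-moment function has no way to express "no match", which is the limitation the GMR formulation addresses by allowing the empty set as a valid prediction.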