OmniVTG: A Large-Scale Dataset and Training Paradigm for Open-World Video Temporal Grounding

arXiv:2604.25276v1 Announce Type: new Video Temporal Grounding (VTG), the task of localizing video segments from text queries, struggles in open-world settings due to limited dataset scale and semantic diversity, causing performance gaps between common and rare concepts. To overcome these limitations, we