AI RESEARCH

Em-Garde: A Propose-Match Framework for Proactive Streaming Video Understanding

arXiv cs.AI

arXiv:2603.19054v1 Announce Type: cross Recent advances in streaming video understanding have enabled a new interaction paradigm in which models respond proactively to user queries. Current proactive VideoLLMs make a triggering decision at every frame, an approach that suffers from an efficiency-accuracy dilemma. We propose Em-Garde, a novel framework that decouples semantic understanding from streaming perception.