AI RESEARCH

CATS: Curvature Aware Temporal Selection for efficient long video understanding

arXiv cs.CV

arXiv:2605.09223v1 Announce Type: new Understanding long videos with multimodal large language models (MLLMs) requires selecting a small subset of informative frames under strict computational budgets, where exhaustive processing is infeasible and optimal selection is combinatorial. We propose CATS, a curvature-aware frame selection method that explicitly models the temporal geometry of query-frame relevance to identify salient events and their surrounding context.
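
To make the idea of "curvature" on a query-frame relevance curve concrete, here is a minimal illustrative sketch. It is not the CATS algorithm, whose details the abstract does not give. It assumes relevance is a list of per-frame scores and approximates curvature with a second-order finite difference. The function names, the `weight` parameter, and the scoring rule (relevance plus weighted absolute curvature) are all hypothetical choices made for illustration.

```python
def discrete_curvature(scores):
    """Second-order central difference of a 1-D relevance curve.

    Endpoints are set to 0.0 because they lack two neighbours.
    """
    n = len(scores)
    curv = [0.0] * n
    for t in range(1, n - 1):
        curv[t] = scores[t - 1] - 2.0 * scores[t] + scores[t + 1]
    return curv


def select_frames(scores, budget, weight=1.0):
    """Return `budget` frame indices in temporal order.

    Hypothetical scoring rule: relevance + weight * |curvature|, so sharp
    peaks and abrupt changes in relevance are favoured over flat regions.
    """
    curv = discrete_curvature(scores)
    combined = [s + weight * abs(c) for s, c in zip(scores, curv)]
    ranked = sorted(range(len(scores)), key=lambda t: combined[t], reverse=True)
    return sorted(ranked[:budget])


# Toy relevance curve: a sharp peak at frame 3 and a step up at frame 7.
relevance = [0.1, 0.1, 0.2, 0.9, 0.2, 0.1, 0.1, 0.5, 0.5, 0.5]
selected = select_frames(relevance, budget=2)
print(selected)  # [3, 7]
```

In this toy curve the sketch picks the sharp peak (frame 3) and the onset of the plateau (frame 7), ahead of frames 8 and 9, which are equally relevant but lie in the flat part of the plateau. A greedy top-k rule like this one sidesteps the combinatorial search the abstract mentions, at the cost of ignoring interactions between selected frames.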