AdaSpark: Adaptive Sparsity for Efficient Long-Video Understanding

ArXi:2604.08077v1 Announce Type: new Processing long-form videos with Video Large Language Models (Video-LLMs) is computationally prohibitive. Current efficiency methods often compromise fine-grained perception through irreversible information disposal or inhibit long-range temporal modeling via rigid, predefined sparse patterns. This paper