AI RESEARCH
Sink-Token-Aware Pruning for Fine-Grained Video Understanding in Efficient Video LLMs
arXiv cs.LG
arXiv:2604.20937v1 Announce Type: new Video Large Language Models (Video LLMs) incur high inference latency due to the large number of visual tokens provided to the LLM. To address this