AI RESEARCH

LRCP: Low-Rank Compressibility Guided Visual Token Pruning for Efficient LVLMs

arXiv cs.CV

arXiv:2605.15621v1 Announce Type: new Large vision-language models (LVLMs) achieve strong multimodal understanding, but their inference cost grows rapidly with the number of visual tokens, especially for high-resolution images and long videos. Existing attention-based methods estimate token importance from attention scores, which may