AI RESEARCH
QAPruner: Quantization-Aware Vision Token Pruning for Multimodal Large Language Models
arXiv CS.AI
arXiv:2604.02816v1 Announce Type: cross Multimodal Large Language Models (MLLMs) have shown strong reasoning ability, but their high computational and memory costs hinder deployment in resource-constrained settings. While Post-