IWP: Token Pruning as Implicit Weight Pruning in Large Vision Language Models

arXiv:2604.00757v1 Announce Type: cross

Large Vision Language Models show impressive performance across image and video understanding tasks, yet their computational cost grows rapidly with the number of visual tokens. Existing token pruning methods mitigate this issue through empirical approaches while overlooking the internal mechanism of attention. In this paper, we propose a novel