RCP: Representation Consistency Pruner for Mitigating Distribution Shift in Large Vision-Language Models

arXiv:2604.04972v1 Announce Type: new

Large Vision-Language Models (LVLMs) suffer from prohibitive inference costs because the language decoder must process a massive number of visual tokens. Existing pruning methods often degrade performance significantly: irreversibly removing visual tokens shifts the distribution of the hidden states away from the pre-trained full-token regime.
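To make the distribution-shift claim concrete, the following is a minimal toy sketch, not the RCP method itself. It assumes a single attention query over random "visual tokens", a plain top-k pruning rule based on attention weight, and an L2 distance as the measure of shift; all function names (`attend`, `prune_top_k`, `l2_distance`) and sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-query scaled dot-product attention.

    Returns the attended output vector and the attention weights.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
    return out, weights


def prune_top_k(weights, k):
    """Indices of the k highest-weight tokens, returned in original order."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]
    return sorted(ranked)


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy setup (assumed sizes): 64 random visual tokens of dimension 8.
rng = random.Random(0)
dim, num_tokens, keep = 8, 64, 16
query = [rng.gauss(0, 1) for _ in range(dim)]
tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]

# Full-token regime: the output the model would have seen during pre-training.
full_out, weights = attend(query, tokens, tokens)

# Pruned regime: keep only the top-k tokens, then attend again.
kept = prune_top_k(weights, keep)
pruned_tokens = [tokens[i] for i in kept]
pruned_out, _ = attend(query, pruned_tokens, pruned_tokens)

# The gap between the two outputs is a crude proxy for hidden-state shift.
shift = l2_distance(full_out, pruned_out)
print(f"kept {len(kept)}/{num_tokens} tokens, output shift = {shift:.4f}")
```

Even though the kept tokens are the ones with the highest attention weight, the softmax renormalizes over a smaller set, so the pruned output generally differs from the full-token output. In a real LVLM this gap would compound across decoder layers, which is the kind of deviation the abstract describes.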