RTPrune: Reading-Twice Inspired Token Pruning for Efficient DeepSeek-OCR Inference

arXiv:2605.00392v1 Announce Type: cross DeepSeek-OCR leverages visual-text compression to reduce long-text processing costs and accelerate inference, yet its visual tokens still carry redundant textual and structural information. Moreover, current token pruning methods designed for conventional vision-language models (VLMs) fail to preserve textual fidelity because their compression mechanisms are ill-suited to this setting.