AI RESEARCH

Mitigating Premature Discretization with Progressive Quantization for Robust Vector Tokenization

arXiv cs.LG

arXiv:2603.22304v1 Announce Type: new Vector Quantization (VQ) has become the cornerstone of tokenization for many multimodal Large Language Models and diffusion synthesis. However, existing VQ paradigms suffer from a fundamental conflict: they enforce discretization before the encoder has captured the underlying data manifold. We term this phenomenon Premature Discretization. To resolve this, we propose Progressive Quantization (ProVQ), which incorporates the dynamics of quantization hardness as a fundamental yet previously overlooked axis in VQ
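
To make the idea of "quantization hardness" concrete, here is a minimal NumPy sketch of one generic way to move from soft to hard codebook assignment. It is not the ProVQ algorithm, which the abstract does not specify. The names `soft_to_hard_quantize` and `hardness_schedule`, the inverse-temperature parameterization, and the geometric ramp are all assumptions made for illustration.

```python
import numpy as np


def soft_to_hard_quantize(z, codebook, hardness):
    """Blend between soft and hard codebook assignment (illustrative only).

    z: (n, d) encoder outputs; codebook: (k, d) code vectors.
    hardness: assumed inverse temperature. At 0 every code gets equal weight;
    as it grows, the output approaches nearest-neighbour VQ.
    Returns the (n, d) quantized vectors and the (n,) nearest-code indices.
    """
    # Squared Euclidean distance between every latent and every code: (n, k)
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # Softmax over negative distances, shifted per row for numerical stability
    logits = -hardness * d2
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ codebook, d2.argmin(axis=1)


def hardness_schedule(step, total_steps, h_min=0.1, h_max=100.0):
    """Hypothetical geometric ramp from h_min to h_max over training."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return h_min * (h_max / h_min) ** t


codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z = np.array([[0.1, 0.0]])
for step in (0, 5, 10):
    h = hardness_schedule(step, total_steps=10)
    z_q, idx = soft_to_hard_quantize(z, codebook, h)
    print(step, h, z_q, idx)
```

With low hardness early in training, the output is a smooth mixture of code vectors, so an encoder trained against it is not forced onto discrete codes right away. As the schedule raises the hardness, the mixture collapses onto the nearest code, which is the usual hard VQ lookup.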