RQ-MoE: Residual Quantization via Mixture of Experts for Efficient Input-Dependent Vector Compression

arXiv:2605.14359v1 Announce Type: new Vector quantization is a fundamental tool for compressing high-dimensional embeddings, yet existing multi-codebook methods rely on static codebooks, which limits expressiveness under heterogeneous data geometry. Recent dynamic quantizers such as QINCo adapt codebooks to individual inputs and improve expressiveness, but their strict sequential dependencies create decoding bottlenecks.
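
To make the static multi-codebook baseline concrete, the following is a minimal sketch of plain residual quantization with fixed codebooks. It is not the RQ-MoE method and not QINCo. The function names (`rq_encode`, `rq_decode`) and the toy `CODEBOOKS` values are assumptions chosen only for illustration. With static codebooks, decoding is a sum of independent table lookups. In an input-adaptive quantizer, each step's codebook would instead depend on the partial reconstruction so far, which is the kind of sequential dependency the abstract refers to.

```python
from typing import List

Vector = List[float]
Codebook = List[Vector]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rq_encode(x: Vector, codebooks: List[Codebook]) -> List[int]:
    """Greedy residual quantization with static codebooks.

    At each step, pick the codeword nearest to the current residual,
    record its index, and subtract it from the residual.
    """
    residual = list(x)
    codes = []
    for cb in codebooks:
        idx = min(range(len(cb)), key=lambda k: sq_dist(residual, cb[k]))
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return codes


def rq_decode(codes: List[int], codebooks: List[Codebook]) -> Vector:
    """Reconstruct by summing the selected codeword from each codebook.

    The codebooks are fixed, so each lookup is independent of the others.
    """
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out


# Hypothetical toy codebooks: a coarse stage followed by a fine stage.
CODEBOOKS: List[Codebook] = [
    [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]],
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
]

codes = rq_encode([5.0, 1.0], CODEBOOKS)
recon = rq_decode(codes, CODEBOOKS)
print(codes, recon)
```

Because the codebooks here never change, the same small tables serve every input. That keeps decoding cheap, but it is also the source of the limited expressiveness noted above.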