AI RESEARCH

Multi-Token Residual Prediction

arXiv CS.LG

arXiv:2605.18817v1 Announce Type: new Diffusion Language Models (DLMs) generate text by iteratively denoising masked token sequences, offering a tradeoff between parallelism and quality compared to autoregressive models. In current practice, the number of tokens decoded per step is controlled by a confidence threshold, and quality degrades monotonically as more tokens are denoised per step. We
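
For readers unfamiliar with confidence-thresholded decoding, the following is a minimal sketch of one denoising step. It illustrates the general practice the abstract refers to, not the paper's own method. The names `MASK`, `softmax`, and `unmask_step` are hypothetical, and the fallback rule (commit the single most confident position when nothing clears the threshold) is an assumption added so that decoding always makes progress.

```python
import math

MASK = None  # hypothetical marker for a still-masked position


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def unmask_step(tokens, logits, threshold):
    """Run one denoising step over a partially masked sequence.

    tokens:    list of token ids, with MASK at undecoded positions
    logits:    per-position list of vocabulary logits (assumed model output)
    threshold: minimum top-1 probability required to commit a token

    Returns the updated sequence and the number of tokens committed.
    """
    candidates = []
    for i, tok in enumerate(tokens):
        if tok is MASK:
            probs = softmax(logits[i])
            best = max(range(len(probs)), key=probs.__getitem__)
            candidates.append((probs[best], i, best))

    if not candidates:
        return list(tokens), 0

    # Commit every masked position whose confidence clears the threshold.
    accepted = [c for c in candidates if c[0] >= threshold]
    if not accepted:
        # Assumed fallback: commit the single most confident position.
        accepted = [max(candidates)]

    out = list(tokens)
    for _, i, best in accepted:
        out[i] = best
    return out, len(accepted)
```

In this sketch, lowering `threshold` commits more tokens per step (more parallelism, fewer steps), while raising it commits fewer, which is the knob behind the parallelism and quality tradeoff described above.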