AI RESEARCH

Infinite Mask Diffusion for Few-Step Distillation

arXiv CS.AI

ArXi:2605.10518v1 Announce Type: cross Masked Diffusion Models (MDMs) have emerged as a promising alternative to autoregressive models in language modeling, offering the advantages of parallel decoding and bidirectional context processing within a simple yet effective framework. Specifically, their explicit distinction between masked tokens and data underlies their simple framework and effective conditional generation. However, MDMs typically require many sampling iterations due to factorization errors stemming from simultaneous token updates.