AI RESEARCH
Understanding and Accelerating the Training of Masked Diffusion Language Models
arXiv CS.AI
arXiv:2605.13026v1 Announce Type: cross Masked diffusion models (MDMs) have emerged as a promising alternative to autoregressive models (ARMs) for language modeling. However, MDMs are known to learn substantially more slowly than ARMs, which may become problematic when scaling MDMs to larger models. Therefore, we ask the following question: how can we accelerate standard MDM