AI RESEARCH
Mask Is What DLLM Needs: A Masked Data Training Paradigm for Diffusion LLMs
arXiv CS.LG
arXiv:2603.15803v1 Announce Type: new Discrete diffusion models offer global context awareness and flexible parallel generation. However, uniform random noise schedulers in standard DLLM