AI RESEARCH
Roll Out and Roll Back: Diffusion LLMs are Their Own Efficiency Teachers
arXiv CS.CL
arXiv:2605.16941v1 Announce Type: new Diffusion Large Language Models (DLLMs) promise fast parallel generation, yet open-source DLLMs still face a severe quality-speed trade-off: accelerating decoding by revealing multiple tokens often causes substantial quality degradation. We attribute this dilemma to a train-inference mismatch amplified by irreversible decoding. While