Consistency diffusion language models: Up to 14x faster inference without sacrificing quality

Together AI Blog

Standard diffusion language models can't use KV caching and need too many refinement steps to be practical. CDLM fixes both with a post-