Consistency diffusion language models: Up to 14x faster inference without sacrificing quality
Together AI Blog • Generative AI
Standard diffusion language models can't use KV caching and need too many refinement steps to be practical. CDLM fixes both with a post-