AI RESEARCH

BitLM: Unlocking Multi-Token Language Generation with Bitwise Continuous Diffusion

arXiv cs.CL

arXiv:2605.11577v1. Announce Type: new. Autoregressive language models generate text one token at a time, yet natural language is inherently structured in multi-token units, such as phrases, n-grams, and collocations, that carry meaning jointly. This one-token bottleneck limits both the expressiveness of the model during pre-