AI RESEARCH

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

arXiv cs.AI

arXiv:2602.14041v2 Announce Type: replace-cross

We present BitDance, a scalable autoregressive (AR) image generator that predicts binary visual tokens instead of codebook indices. With high-entropy binary latents, each BitDance token can represent up to $2^{256}$ states, yielding a compact yet highly expressive discrete representation. Sampling from a token space this large is impractical with standard classification, since a softmax would need one output per state. To resolve this, BitDance uses a binary diffusion head: instead of predicting an index with softmax, it employs continuous-space diffusion to generate the binary tokens.
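To make the scale of the token space concrete, the following is a minimal sketch of the arithmetic behind a 256-bit token. It is not BitDance's tokenizer or diffusion head: the sign-based `binarize` function, the Gaussian stand-in latent, and all helper names are assumptions introduced purely for illustration. Only the figure of 256 bits per token (from $2^{256}$ states) comes from the abstract.

```python
import random

# From the abstract: each token can represent up to 2**256 states,
# i.e. 256 binary channels per token.
BITS_PER_TOKEN = 256


def num_states(bits: int) -> int:
    """Number of distinct values a token with `bits` binary channels can take."""
    return 2 ** bits


def binarize(latent: list[float]) -> list[int]:
    """Hypothetical sign quantizer (an assumption, not the paper's method):
    map each continuous channel to 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in latent]


def bits_to_index(bits: list[int]) -> int:
    """Pack a bit vector into the single integer index that an
    index-predicting softmax classifier would have to output."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


# Stand-in continuous latent for one token (random, for illustration only).
rng = random.Random(0)
latent = [rng.gauss(0.0, 1.0) for _ in range(BITS_PER_TOKEN)]

token_bits = binarize(latent)
index = bits_to_index(token_bits)

print(len(token_bits))                          # channels in one token
print(num_states(BITS_PER_TOKEN) > 10 ** 77)    # size of the index space
print(0 <= index < num_states(BITS_PER_TOKEN))  # index fits in that space
```

A classifier over this space would need one logit per possible index, which is why the abstract describes generating the bits themselves rather than predicting an index.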