AI RESEARCH
ScaleMoGen: Autoregressive Next-Scale Prediction for Human Motion Generation
arXiv cs.CV
arXiv:2605.11704v1
We present ScaleMoGen, a scale-wise autoregressive framework for text-driven human motion generation. Unlike conventional autoregressive approaches that rely on standard next-token prediction, ScaleMoGen frames motion generation as a coarse-to-fine process. We quantize 3D motions into compositional discrete tokens across multiple skeletal-temporal scales of increasing granularity, and learn to generate motion by autoregressively predicting next-scale token maps.
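The coarse-to-fine idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a residual multi-scale quantizer with a single shared codebook, varies only the temporal resolution (the abstract describes joint skeletal-temporal scales), and uses linear interpolation for resampling. All names (`resize_time`, `encode_multiscale`, `decode_multiscale`, `generate`, `predict_next_scale`) are hypothetical.

```python
import numpy as np

def resize_time(x, length):
    """Linearly resample a (T, D) sequence to `length` frames."""
    if x.shape[0] == 1:
        return np.repeat(x, length, axis=0)
    src = np.linspace(0.0, 1.0, x.shape[0])
    dst = np.linspace(0.0, 1.0, length)
    return np.stack(
        [np.interp(dst, src, x[:, d]) for d in range(x.shape[1])], axis=1
    )

def quantize(x, codebook):
    """Return the index of the nearest codebook row for each row of x."""
    d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def encode_multiscale(feat, codebook, scales):
    """Assumed residual scheme: one token map per scale, coarse to fine."""
    residual = feat.astype(float).copy()
    recon = np.zeros_like(residual)
    token_maps = []
    for s in scales:
        idx = quantize(resize_time(residual, s), codebook)
        token_maps.append(idx)
        up = resize_time(codebook[idx], feat.shape[0])
        recon += up      # accumulate the reconstruction
        residual -= up   # finer scales encode what is still missing
    return token_maps, recon

def decode_multiscale(token_maps, codebook, length):
    """Sum the upsampled codebook entries of every scale."""
    out = np.zeros((length, codebook.shape[1]))
    for idx in token_maps:
        out += resize_time(codebook[idx], length)
    return out

def generate(predict_next_scale, scales):
    """Next-scale autoregression: each step emits a whole token map,
    conditioned on all coarser maps produced so far."""
    maps = []
    for s in scales:
        maps.append(predict_next_scale(maps, s))
    return maps
```

In this sketch, `predict_next_scale` stands in for a learned, text-conditioned model. The difference from next-token prediction is that each autoregressive step emits every token of one scale at once, so the number of steps equals the number of scales rather than the number of tokens.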