AI RESEARCH

MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration

arXiv CS.CV

arXiv:2511.18886v2 Announce Type: replace Recent interactive video world model methods generate scene evolution conditioned on user instructions. Although they achieve impressive results, two key limitations remain. First, they exhibit motion drift in complex environments with multiple interacting subjects, where dynamic subjects fail to follow realistic motion patterns during scene evolution. Second, they suffer from error accumulation in long-horizon interactions, where autoregressive generation gradually drifts from earlier scene states and causes structural and semantic inconsistencies.