AI RESEARCH
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
arXiv CS.AI
arXiv:2604.26694v1 Announce Type: cross We propose X-WAM, a Unified 4D World Model that combines real-time robotic action execution and high-fidelity 4D world synthesis (video + 3D reconstruction) in a single framework. It addresses two critical limitations of prior unified world models (e.g., UWM): they model only 2D pixel space, and they fail to balance action efficiency with world-modeling quality.