AI RESEARCH

MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

arXiv CS.AI

arXiv:2602.08961v2. We present MotionCrafter, a framework that leverages video generators to jointly reconstruct 4D geometry and estimate dense motion from a monocular video. The key idea is a joint representation of dense 3D point maps and 3D scene flows in a shared coordinate system, together with a 4D VAE tailored to learn this representation effectively. Unlike prior work that strictly aligns 3D values and latents with RGB VAE latents, despite their fundamentally different distributions, we show that such alignment is unnecessary and can hurt performance.
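To make the joint representation concrete, here is a minimal NumPy sketch of how per-pixel 3D point maps and 3D scene flows might be stored together in one shared coordinate system. The array layout `(T, H, W, 6)`, the function names `make_joint_representation` and `advect_points`, and the convention that a flow vector is the displacement of a point from frame t to frame t+1 are all assumptions made for illustration; the abstract does not specify them, and this is not MotionCrafter's implementation.

```python
import numpy as np


def make_joint_representation(point_maps, scene_flows):
    """Stack per-pixel 3D points and 3D flows into one (T, H, W, 6) array.

    Both inputs are assumed to be expressed in the same (shared) coordinate
    system and to have shape (T, H, W, 3). This layout is hypothetical.
    """
    point_maps = np.asarray(point_maps, dtype=np.float32)
    scene_flows = np.asarray(scene_flows, dtype=np.float32)
    if point_maps.shape != scene_flows.shape or point_maps.shape[-1] != 3:
        raise ValueError("expected two arrays of shape (T, H, W, 3)")
    return np.concatenate([point_maps, scene_flows], axis=-1)


def advect_points(joint):
    """Add each flow vector to its point.

    Under the assumed convention, the result is where the point observed at
    frame t lies at frame t+1, in the same shared coordinate system.
    """
    return joint[..., :3] + joint[..., 3:]


# Toy scene: a flat plane at depth 1 that moves 0.1 units along x per frame.
T, H, W = 2, 4, 5
points = np.zeros((T, H, W, 3), dtype=np.float32)
points[..., 2] = 1.0
flows = np.zeros_like(points)
flows[..., 0] = 0.1

joint = make_joint_representation(points, flows)
moved = advect_points(joint)
```

Because both quantities live in one coordinate system, adding a flow vector to its point needs no per-frame camera transform, which is the practical benefit of a shared frame in this sketch.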