Pixel Motion Diffusion is What We Need for Robot Control
arXiv cs.CV
arXiv:2509.22652v2. We present DAWN (Diffusion is All We Need for robot control), a unified diffusion-based framework for language-conditioned robotic manipulation that bridges high-level motion intent and low-level robot action via a structured pixel motion representation. In DAWN, both the high-level and low-level controllers are modeled as diffusion processes, yielding a fully trainable, end-to-end system with interpretable intermediate motion abstractions.
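To make the two-level structure concrete, here is a minimal sketch, not the authors' implementation. It chains two generic DDPM-style samplers: the first produces a dense pixel motion field from an image and a text embedding, and the second produces an action sequence conditioned on that motion field. Everything named here is an assumption made for illustration: the noise predictors `toy_motion_eps` and `toy_action_eps` are untrained stand-ins for learned networks, the linear noise schedule is a common default, and the array shapes (a 32x32 motion field, 8 actions of 7 dimensions) are hypothetical. Conditioning the low-level stage on pixel motion alone is also a simplification.

```python
import numpy as np


def ddpm_sample(eps_model, shape, cond, num_steps=50, seed=0):
    """Generic DDPM ancestral sampling.

    eps_model(x, t, cond) must return the predicted noise for sample x at step t.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)  # assumed linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps = eps_model(x, t, cond)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added on the final step.
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


def toy_motion_eps(x, t, cond):
    """Hypothetical stand-in for a trained high-level noise predictor."""
    image, text_embedding = cond
    return 0.1 * x + 0.01 * text_embedding.mean()


def toy_action_eps(x, t, cond):
    """Hypothetical stand-in for a trained low-level noise predictor."""
    pixel_motion = cond
    return 0.1 * x + 0.01 * pixel_motion.mean()


# Hypothetical inputs: a blank RGB frame and a dummy instruction embedding.
image = np.zeros((32, 32, 3))
text_embedding = np.ones(16)

# High level: sample a per-pixel (dx, dy) motion field.
pixel_motion = ddpm_sample(toy_motion_eps, (32, 32, 2), (image, text_embedding))

# Low level: sample an action chunk conditioned on the motion field.
actions = ddpm_sample(toy_action_eps, (8, 7), pixel_motion, seed=1)

print(pixel_motion.shape, actions.shape)
```

The intermediate `pixel_motion` array is the piece the abstract calls an interpretable motion abstraction: it can be inspected or visualized before any action is generated.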