AI RESEARCH

E0: Enhancing Generalization and Fine-Grained Control in VLA Models via Tweedie Discrete Diffusion

arXiv cs.LG

arXiv:2511.21542v2 Announce Type: replace-cross

Vision-Language-Action (VLA) models offer a unified framework for robotic manipulation by integrating visual perception, language understanding, and control generation. However, existing VLA systems still struggle to generalize across diverse tasks, scenes, and camera viewpoints, and often produce coarse or unstable actions.