AI RESEARCH
Disentangled Robot Learning via Separate Forward and Inverse Dynamics Pretraining
arXiv CS.CV
arXiv:2604.16391v1 Announce Type: cross Vision-language-action (VLA) models have shown great potential for building generalist robots, but they still face a dilemma: the misalignment between 2D image forecasting and 3D action prediction. Besides, such a vision-action entangled