AI RESEARCH

MVP-LAM: Learning Action-Centric Latent Action via Cross-Viewpoint Reconstruction

arXiv cs.CV

arXiv:2602.03668v2 Announce Type: replace-cross Latent actions learned from diverse human videos serve as pseudo-labels for vision-language-action (VLA) pre