AI RESEARCH

EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video

arXiv cs.LG

arXiv:2505.11709v3. Imitation learning for manipulation has a well-known data scarcity problem. Unlike natural language and 2D computer vision, there is no Internet-scale corpus of data for dexterous manipulation. One appealing option is egocentric human video, a passively scalable data source. However, existing large-scale datasets such as Ego4D do not have native hand pose annotations and do not focus on object manipulation. To this end, we use Apple Vision Pro to collect EgoDex: the largest and most diverse dataset of dexterous human manipulation to date.