
CARI4D: Category Agnostic 4D Reconstruction of Human-Object Interaction

arXiv cs.CV

arXiv:2512.11988v3

Accurate capture of human-object interaction from ubiquitous sensors like RGB cameras is important for applications in human understanding, gaming, and robot learning. However, inferring 4D interactions from a single RGB view is highly challenging due to unknown object and human information, depth ambiguity, occlusion, and complex motion, all of which hinder consistent 3D and temporal reconstruction. Previous methods simplify the setup by assuming a ground-truth object template or cons