AI RESEARCH

Multi-Modal Manipulation via Multi-Modal Policy Consensus

arXiv cs.AI

arXiv:2509.23468v3 Announce Type: replace-cross Effectively integrating diverse sensory modalities is crucial for robotic manipulation. However, the typical approach of feature concatenation is often suboptimal: dominant modalities such as vision can overwhelm sparse but critical signals like touch in contact-rich tasks, and monolithic architectures cannot flexibly incorporate new or missing modalities without re