FD-VLA: Force-Distilled Vision-Language-Action Model for Contact-Rich Manipulation

arXiv:2602.02142v2 Announce Type: replace-cross

Force sensing is a crucial modality for Vision-Language-Action (VLA) frameworks, as it enables fine-grained perception and dexterous manipulation in contact-rich tasks. We present Force-Distilled VLA (FD-VLA), a novel framework that integrates force awareness into contact-rich manipulation without relying on physical force sensors.