LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models

arXiv:2603.28301v1 Announce Type: new Vision-Language-Action (VLA) models achieve strong performance in robotic manipulation by leveraging pre-trained vision-language backbones. However, in downstream robotic settings, they are typically fine-tuned with limited data, leading to overfitting to specific instruction formulations and leaving robustness to paraphrased instructions underexplored. To study this gap, we