DUAL-Bench: Measuring Over-Refusal and Robustness in Vision-Language Models

arXiv cs.CL

arXiv:2510.10846v3 As vision-language models (VLMs) become increasingly capable, balancing safety and usefulness remains a central challenge. Safety mechanisms, while essential, can backfire and cause over-refusal, where models decline benign requests out of excessive caution. Yet few existing benchmarks systematically address over-refusal in the visual modality. This setting