AI RESEARCH
Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility
arXiv CS.AI
arXiv:2602.03402v3 Announce Type: replace Vision-language models (VLMs) extend the reasoning capabilities of large language models (LLMs) to cross-modal settings, yet remain highly vulnerable to multimodal jailbreak attacks. Existing defenses predominantly rely on safety fine-tuning or aggressive token manipulations, incurring substantial