AI RESEARCH

Rethinking Jailbreak Detection of Large Vision Language Models with Representational Contrastive Scoring

arXiv cs.LG

arXiv:2512.12069v3 Announce Type: replace-cross Large Vision-Language Models (LVLMs) are vulnerable to a growing array of multimodal jailbreak attacks, necessitating defenses that are both generalizable to novel threats and efficient for practical deployment. Many current strategies fall short, either targeting specific attack patterns, which limits generalization, or imposing high computational overhead.