DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs

arXiv:2605.18915v1 Announce Type: cross Multimodal Large Language Models (MLLMs) are vulnerable to jailbreak attacks, which can elicit harmful responses. Many MLLMs support multi-image inputs, inadvertently