OOD-MMSafe: Advancing MLLM Safety from Harmful Intent to Hidden Consequences

arXiv:2603.09706v1 Announce Type: new While safety alignment for Multimodal Large Language Models (MLLMs) has gained significant attention, current paradigms primarily target malicious intent or situational violations. We propose shifting the safety frontier toward consequence-driven safety, a paradigm essential for the robust deployment of autonomous and embodied agents. To formalize this shift, we