SPARK: Jailbreaking T2V Models by Synergistically Prompting Auditory and Recontextualized Knowledge

ArXi:2511.13127v3 Announce Type: replace Jailbreak attacks can circumvent model safety guardrails and reveal critical blind spots. Prior attacks on text-to-video (T2V) models typically add adversarial perturbations to obviously unsafe prompts, which are often easy to detect and defend. In contrast, we show that benign-looking prompts containing rich, implicit cues can induce T2V models to generate semantically unsafe videos that both violate policy and preserve the original (blocked) intent.