AI RESEARCH

Mitigating Many-Shot Jailbreaking

arXiv CS.AI

ArXi:2504.09604v3 Announce Type: cross Many-shot jailbreaking (MSJ) is an adversarial technique that exploits the long context windows of modern LLMs to circumvent model safety