AI RESEARCH
Mitigating Many-Shot Jailbreaking
arXiv CS.AI
arXiv:2504.09604v3 Announce Type: cross Many-shot jailbreaking (MSJ) is an adversarial technique that exploits the long context windows of modern LLMs to circumvent model safety