AI RESEARCH

SoK: Robustness in Large Language Models against Jailbreak Attacks

arXiv CS.AI

arXiv:2605.05058v1 Announce Type: cross

Large Language Models (LLMs) have achieved remarkable success but remain highly susceptible to jailbreak attacks, in which adversarial prompts coerce models into generating harmful, unethical, or policy-violating outputs. Such attacks pose real-world risks, eroding safety, trust, and regulatory compliance in high-stakes applications.