SoK: Robustness in Large Language Models against Jailbreak Attacks
arXiv CS.AI
arXiv:2605.05058v1 Announce Type: cross
Large Language Models (LLMs) have achieved remarkable success but remain highly susceptible to jailbreak attacks, in which adversarial prompts coerce models into generating harmful, unethical, or policy-violating outputs. Such attacks pose real-world risks, eroding safety, trust, and regulatory compliance in high-stakes applications.