AI RESEARCH

Break Me If You Can: Self-Jailbreaking of Aligned LLMs via Lexical Insertion Prompting

arXiv CS.CL

ArXi:2601.02670v2 Announce Type: replace