AI RESEARCH
Automating Deception: Scalable Multi-Turn LLM Jailbreaks
arXiv cs.LG
arXiv:2511.19517v2 Announce Type: replace
Multi-turn conversational attacks pose a persistent threat to Large Language Models (LLMs). They bypass safety alignment by exploiting psychological principles such as Foot-in-the-Door (FITD), in which a small initial request paves the way for a more significant one. Progress in defending against these attacks is hindered by a reliance on manual, hard-to-scale dataset creation. This paper