AI RESEARCH
Model-Agnostic Lifelong LLM Safety via Externalized Attack-Defense Co-Evolution
arXiv CS.CL
arXiv:2605.13411v1 Announce Type: cross Large language models remain vulnerable to adversarial prompts that elicit harmful outputs. Existing safety paradigms typically couple red-teaming and post-