Why Do Safety Guardrails Degrade Across Languages?

arXiv cs.LG

arXiv:2605.17173v1 Announce Type: cross Large language models exhibit safety degradation in non-English languages. Standard evaluation relies on Jailbreak Success Rate (JSR), which conflates several safety-driving factors into a single number, obscuring the specific cause(s) of safety failure. We