Anthropic details how it improved Claude's safety training after finding agentic misalignment in older models, such as Opus 4 blackmailing engineers (Anthropic)
TechMeme
•
Generative AI
Anthropic: Anthropic details how it improved Claude's safety