Anthropic repeatedly accidentally trained against the CoT, demonstrating inadequate processes

Alignment Forum • April 14, 2026

It turns out that Anthropic accidentally trained against the chain of thought of Claude Mythos Preview in around 8% of