Your LLM Ablation Study Is Lying to You. Here Is the Proof.

Towards AI
Generative AI AI Safety AI Research

Three findings from mechanistic interpretability that break everything you thought you knew about how transformers work.