AI RESEARCH
From Mechanistic to Compositional Interpretability
arXiv cs.LG
arXiv:2605.08934v1 (announce type: new)
Mechanistic interpretability aims to explain neural model behaviour by reverse-engineering learned computational structure into human-understandable components. Without a formal framework, however, mechanistic explanations cannot be objectively verified, compared, or composed. We