Rethinking Layer Relevance in Large Language Models Beyond Cosine Similarity

arXiv:2605.14075v1 Announce Type: new Large language models (LLMs) have revolutionized natural language processing. Understanding their internal mechanisms is crucial for developing interpretable and optimized architectures. Mechanistic interpretability has led to the development of various methods for assessing layer relevance, with cosine similarity being a widely used tool in the field. In this work, we demonstrate that cosine similarity is a poor proxy for the actual performance degradation caused by layer removal.
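
To make the metric under discussion concrete, the following is a minimal sketch of one common way a cosine-similarity layer score is computed: compare the hidden states entering a layer with those leaving it, token by token, and treat a high similarity as a sign that the layer changes little. The function names `cosine_similarity` and `cosine_layer_score`, the `1 - mean similarity` convention, and the toy vectors are illustrative assumptions, not the formulation or data used in this work.

```python
import math


def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_u * norm_v, eps)


def cosine_layer_score(h_in, h_out):
    """Hypothetical relevance score: 1 - mean token-wise cosine similarity.

    h_in, h_out: lists of per-token hidden-state vectors before and after
    a layer. A score near 0 means the layer barely rotates its input.
    """
    sims = [cosine_similarity(u, v) for u, v in zip(h_in, h_out)]
    return 1.0 - sum(sims) / len(sims)


# Toy hidden states (made up for illustration): two tokens, two dimensions.
h_before = [[1.0, 0.0], [0.0, 1.0]]
h_rescaled = [[3.0, 0.0], [0.0, 0.5]]   # same directions, different norms
h_rotated = [[0.0, 1.0], [1.0, 0.0]]    # orthogonal to the inputs

score_rescaled = cosine_layer_score(h_before, h_rescaled)
score_rotated = cosine_layer_score(h_before, h_rotated)
print(score_rescaled, score_rotated)
```

Because cosine similarity ignores vector magnitude, a score of this form is unchanged when a layer only rescales its input. Measuring the actual performance degradation caused by layer removal, by contrast, requires evaluating the model with that layer removed.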