How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

arXiv:2603.25325v1 Announce Type: cross Weight pruning is a standard technique for compressing large language models, yet its effect on learned internal representations remains poorly understood. We present the first systematic study of how unstructured pruning reshapes the feature geometry of language models, using Sparse Autoencoders (SAEs) as interpretability probes.