The Geometric Wall: Manifold Structure Predicts Layerwise Sparse Autoencoder Scaling Laws

arXiv:2605.09887v1 Announce Type: cross

Sparse autoencoders (SAEs) operationalise the linear representation hypothesis: they reconstruct model activations as sparse linear combinations of interpretable dictionary atoms, on the implicit assumption that activation space is well approximated by a globally linear structure. Their reconstruction error varies sharply across layers in ways that existing scaling laws, fitted at single layers, do not explain.
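
To make "sparse linear combinations of dictionary atoms" concrete, the following is a minimal NumPy sketch of an SAE forward pass and a normalized reconstruction error. It is an illustration under stated assumptions, not the paper's implementation: the top-k sparsity rule, the subtraction of the decoder bias before encoding, the normalization of the error, and all names (`sae_forward`, `normalized_mse`, `W_enc`, `W_dec`, `k`) are hypothetical choices made here, and the weights are random rather than trained.

```python
import numpy as np


def sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """Illustrative top-k SAE forward pass (assumed variant, not from the source).

    x:     (batch, d) activations
    W_enc: (d, n) encoder weights; W_dec: (n, d) dictionary atoms as rows
    k:     number of atoms allowed to be active per example
    """
    pre = (x - b_dec) @ W_enc + b_enc
    # Indices of the k largest pre-activations in each row.
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    rows = np.arange(pre.shape[0])[:, None]
    codes = np.zeros_like(pre)
    # Keep only those k entries, clipped at zero (ReLU); all others stay zero.
    codes[rows, idx] = np.maximum(pre[rows, idx], 0.0)
    # Reconstruction is a sparse linear combination of dictionary atoms.
    x_hat = codes @ W_dec + b_dec
    return codes, x_hat


def normalized_mse(x, x_hat):
    """Squared reconstruction error divided by the variance of x around its mean."""
    return np.sum((x - x_hat) ** 2) / np.sum((x - x.mean(axis=0)) ** 2)


# Toy usage with random, untrained weights (hypothetical sizes).
rng = np.random.default_rng(0)
d, n, k = 16, 64, 4
x = rng.normal(size=(8, d))
W_enc = rng.normal(size=(d, n)) / np.sqrt(d)
W_dec = rng.normal(size=(n, d)) / np.sqrt(n)
codes, x_hat = sae_forward(x, W_enc, np.zeros(n), W_dec, np.zeros(d), k)
print("active atoms per example:", np.count_nonzero(codes, axis=1))
print("normalized MSE:", normalized_mse(x, x_hat))
```

Computing such an error separately on activations from each layer would give the kind of layerwise reconstruction-error profile the abstract refers to.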