AI RESEARCH

AutoCompress: Critical Layer Isolation for Efficient Transformer Compression

arXiv cs.LG

arXiv:2604.22786v1 Announce Type: new We present AutoCompress, a transformer compression method motivated by an empirical finding: in small transformers, Layer 0 carries a disproportionate share of task-critical information. Its NTK-based importance score is 3.6, compared with a maximum of 0.054 across all other layers, a gap of over 60x. Based on this finding, we propose Critical Layer Isolation (CLI), an architecture that protects Layer 0 at full dimensionality, compresses all intermediate layers through a learned bottleneck, and restores the full dimension at the final layer.
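
The CLI layout described above can be illustrated with a small sketch. This is not the authors' implementation: the function names `cli_layer_widths` and `importance_gap`, the layer count of 6, the model width of 512, and the bottleneck width of 128 are all hypothetical values chosen for illustration. Only the two importance scores (3.6 and 0.054) come from the abstract. The sketch assumes "protects", "compresses", and "restores" can be read as the working width of each layer.

```python
def cli_layer_widths(num_layers: int, d_model: int, d_bottleneck: int) -> list[int]:
    """Return the assumed working width of each layer under a CLI-style layout.

    Assumption (not stated in the abstract): Layer 0 and the final layer run at
    the full width d_model, and every layer in between runs at d_bottleneck.
    """
    if num_layers < 3:
        raise ValueError("need at least 3 layers: first, one intermediate, last")
    if not 0 < d_bottleneck <= d_model:
        raise ValueError("d_bottleneck must be in (0, d_model]")
    # Layer 0 is protected, intermediate layers are compressed,
    # and the final layer is back at the full dimension.
    return [d_model] + [d_bottleneck] * (num_layers - 2) + [d_model]


def importance_gap(layer0_score: float, max_other_score: float) -> float:
    """Ratio between Layer 0's importance score and the largest other score."""
    return layer0_score / max_other_score


# Hypothetical configuration, chosen only for illustration.
widths = cli_layer_widths(num_layers=6, d_model=512, d_bottleneck=128)
print(widths)  # [512, 128, 128, 128, 128, 512]

# Scores reported in the abstract; the ratio is consistent with "over 60x".
gap = importance_gap(3.6, 0.054)
print(gap > 60)  # True
```

Under these assumed widths, only the first and last layers keep the full dimension, which is where any savings in intermediate-layer parameters and activations would come from. The actual bottleneck size and the placement of the learned projections are not specified in the abstract.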