AutoCompress: Critical Layer Isolation for Efficient Transformer Compression
arXiv CS.LG
arXiv:2604.22786v1. We present AutoCompress, a transformer compression method motivated by an empirical finding: in small transformers, Layer 0 carries disproportionately high task-critical information, with an NTK-based importance score of 3.6 compared with a maximum of 0.054 across all other layers, a gap of over 60x. Based on this finding, we propose Critical Layer Isolation (CLI), an architecture that protects Layer 0 at full dimensionality, compresses all intermediate layers through a learned bottleneck, and restores the full dimension at the final layer.
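The layer layout described in the abstract can be illustrated with a small width plan. The following is a minimal sketch, not the authors' implementation: the names `LayerSpec`, `cli_layer_plan`, `block_params`, and `estimate_params` are hypothetical, the example sizes are arbitrary, and the parameter count assumes a standard transformer block (four attention projections plus an MLP with 4x expansion, biases and norms ignored) with one linear down-projection and one linear up-projection around the bottleneck. The abstract does not specify any of these details.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    index: int   # position of the layer in the stack
    width: int   # hidden dimension the layer operates at
    role: str    # "protected", "compressed", or "restored"


def cli_layer_plan(num_layers: int, d_model: int, d_bottleneck: int) -> list[LayerSpec]:
    """Assign a width to each layer: full at Layer 0 and the final layer,
    bottleneck width for every intermediate layer."""
    if num_layers < 3:
        raise ValueError("need a first, a final, and at least one intermediate layer")
    if not 0 < d_bottleneck < d_model:
        raise ValueError("bottleneck width must be positive and smaller than d_model")
    plan = [LayerSpec(0, d_model, "protected")]
    for i in range(1, num_layers - 1):
        plan.append(LayerSpec(i, d_bottleneck, "compressed"))
    plan.append(LayerSpec(num_layers - 1, d_model, "restored"))
    return plan


def block_params(width: int) -> int:
    """Rough weight count for one block (assumption): 4*w^2 for the Q, K, V, O
    projections plus 8*w^2 for an MLP with 4x expansion."""
    return 12 * width * width


def estimate_params(plan: list[LayerSpec], d_model: int, d_bottleneck: int) -> int:
    """Block weights plus the assumed down- and up-projection matrices."""
    blocks = sum(block_params(spec.width) for spec in plan)
    projections = 2 * d_model * d_bottleneck
    return blocks + projections


if __name__ == "__main__":
    # Arbitrary illustrative sizes, not taken from the paper.
    plan = cli_layer_plan(num_layers=6, d_model=256, d_bottleneck=64)
    for spec in plan:
        print(spec.index, spec.width, spec.role)
    compressed = estimate_params(plan, 256, 64)
    baseline = 6 * block_params(256)
    print(f"compressed/baseline parameter ratio: {compressed / baseline:.3f}")
```

Under these assumptions, most of the savings come from the intermediate layers, because block weights scale with the square of the width; the two full-width layers dominate the remaining parameter count.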