AI RESEARCH
Layer Equivalence Is Not a Property of Layers Alone: How You Test Redundancy Changes What You Find
arXiv CS.AI
•
ArXi:2605.16234v1 Announce Type: cross When researchers ask whether two transformer layers are "equivalent" for compression, they often conflate distinct tests. Replacement asks whether one layer's map can substitute for another's in place; interchange asks whether two layers approximately commute when their positions are swapped. Both are output-grounded swap-KL probes, but they need not agree: on pretrained transformers the protocol gap can change which layers look safe to prune by several-fold under the same evaluator, especially when replacement distances are high.