Understanding Performance Collapse in Layer-Pruned Large Language Models via Decision Representation Transitions

arXiv:2605.07271v1

Layer pruning efficiently reduces the computational cost of Large Language Models (LLMs) but often triggers sudden performance collapse. Existing representation-based analyses struggle to explain this mechanism. We propose studying pruning through decision representation. Focusing on multiple-choice tasks, we