Two-Stage Regularization-Based Structured Pruning for LLMs

arXiv:2505.18232v3 The deployment of large language models (LLMs) is hindered mainly by their large number of parameters. Structured pruning has emerged as a promising solution. Prior structured pruning methods directly remove unimportant parameters based on certain metrics, which often causes knowledge loss and necessitates extensive re