AI RESEARCH
SoLA: Leveraging Soft Activation Sparsity and Low-Rank Decomposition for Large Language Model Compression
arXiv CS.AI
arXiv:2604.03258v1 Announce Type: cross Large language models (LLMs) have demonstrated impressive capabilities across various tasks, but their billion-scale parameter counts pose deployment challenges. Although existing methods attempt to reduce the scale of LLMs, they require either special hardware or expensive post-