Compressing Transformer Language Models via Matrix Product Operator Decomposition: A Case Study on PicoGPT

arXiv:2603.28534v1 Announce Type: new

Transformer-based language models achieve strong performance across NLP tasks, but their parameter count scales quadratically with the hidden dimension, which makes deployment on resource-constrained hardware expensive. We study Matrix Product Operator (MPO) decomposition as a principled compression method for transformers. MPO factorises each weight matrix into a chain of low-rank cores, with approximation quality controlled by the bond dimension chi. We replace every nn.
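
To make the factorisation described above concrete, the following is a minimal NumPy sketch of an MPO decomposition by sequential truncated SVD. It is an illustration under stated assumptions, not the paper's implementation: the function names `mpo_decompose` and `mpo_reconstruct`, the choice of how the output and input dimensions are split into factors (`out_dims`, `in_dims`), and the use of plain SVD truncation are all hypothetical choices made for this example.

```python
import numpy as np


def mpo_decompose(W, out_dims, in_dims, chi):
    """Split W (prod(out_dims) x prod(in_dims)) into a chain of MPO cores.

    Hypothetical helper for illustration. Each core has shape
    (r_left, out_k, in_k, r_right), and every internal bond is capped at chi.
    """
    n = len(out_dims)
    assert len(in_dims) == n
    assert W.shape == (int(np.prod(out_dims)), int(np.prod(in_dims)))

    # Reshape to (o1..on, i1..in), then interleave to (o1, i1, o2, i2, ...).
    T = W.reshape(*out_dims, *in_dims)
    perm = [ax for k in range(n) for ax in (k, n + k)]
    rest = T.transpose(perm).reshape(1, -1)

    cores = []
    r_prev = 1
    for k in range(n - 1):
        # Group (left bond, o_k, i_k) as rows and everything else as columns.
        mat = rest.reshape(r_prev * out_dims[k] * in_dims[k], -1)
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(chi, len(S))  # truncate the bond to at most chi
        cores.append(U[:, :r].reshape(r_prev, out_dims[k], in_dims[k], r))
        # Carry the singular values into the remainder of the chain.
        rest = S[:r, None] * Vt[:r]
        r_prev = r
    cores.append(rest.reshape(r_prev, out_dims[-1], in_dims[-1], 1))
    return cores


def mpo_reconstruct(cores):
    """Contract the chain of cores back into a dense matrix."""
    result = cores[0]  # shape (1, o1, i1, r1)
    for core in cores[1:]:
        # Contract the shared bond, keeping output and input indices grouped.
        result = np.einsum("aoir,rpjs->aopijs", result, core)
        a, O, o, I, i, s = result.shape
        result = result.reshape(a, O * o, I * i, s)
    return result[0, :, :, 0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((16, 16))  # stand-in for a linear layer's weight
    for chi in (2, 8, 16):
        cores = mpo_decompose(W, (4, 4), (4, 4), chi)
        W_hat = mpo_reconstruct(cores)
        rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
        n_params = sum(c.size for c in cores)
        print(f"chi={chi}: params={n_params}, relative error={rel_err:.3e}")
```

In this sketch a 16 x 16 matrix split as (4, 4) x (4, 4) with chi = 2 is stored as two cores of shapes (1, 4, 4, 2) and (2, 4, 4, 1), that is 64 parameters instead of 256. Setting chi to the full rank recovers the matrix up to floating-point error. A random matrix compresses poorly, so the errors printed here say nothing about how trained transformer weights behave.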