Compressing Transformer Language Models via Matrix Product Operator Decomposition: A Case Study on PicoGPT
arXiv:2603.28534v1 Announce Type: new
Abstract: Transformer-based language models achieve strong performance across NLP tasks, but their quadratic parameter scaling with hidden dimension makes deployment on resource-constrained hardware expensive. We …
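
To make the quadratic-scaling claim concrete, the following minimal sketch compares the parameter count of a dense d x d weight matrix with that of a generic matrix product operator (MPO) factorization of the same matrix. The function names (`dense_params`, `mpo_params`), the factorization 256 = 4·4·4·4, and the uniform bond dimension of 8 are illustrative assumptions, not values taken from the paper. The abstract is truncated here, so this sketch does not claim to reproduce the authors' setup.

```python
def dense_params(d_in, d_out):
    """Parameter count of a dense weight matrix (bias ignored)."""
    return d_in * d_out


def mpo_params(in_factors, out_factors, bond_dim):
    """Parameter count of an MPO with cores of shape (r_{k-1}, i_k, j_k, r_k).

    Assumption for illustration: every internal bond has the same
    dimension `bond_dim`, and the two boundary bonds have dimension 1.
    """
    if len(in_factors) != len(out_factors):
        raise ValueError("in_factors and out_factors must have equal length")
    n = len(in_factors)
    ranks = [1] + [bond_dim] * (n - 1) + [1]
    return sum(
        ranks[k] * in_factors[k] * out_factors[k] * ranks[k + 1]
        for k in range(n)
    )


# Hypothetical example: hidden dimension 256 factorized as 4 * 4 * 4 * 4.
dense_256 = dense_params(256, 256)            # 65536
mpo_256 = mpo_params([4] * 4, [4] * 4, 8)     # 128 + 1024 + 1024 + 128 = 2304

# Doubling the hidden dimension quadruples the dense parameter count.
dense_512 = dense_params(512, 512)            # 262144

print(dense_256, mpo_256, dense_512)
```

Under these assumed shapes the dense count grows with d squared, while the MPO count grows with the number of cores and the square of the bond dimension. The bond dimension therefore controls the trade-off between compression and expressiveness.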