Highly Efficient and Effective LLMs with Multi-Boolean Architectures
arXiv:2505.22811v5 Announce Type: replace-cross
Abstract: Weight binarization has emerged as a promising strategy to reduce the complexity of large language models (LLMs). Existing approaches fall into post-training binarization, which is simple but c…