REAM: Merging Improves Pruning of Experts in LLMs
arXiv:2604.04356v1 Announce Type: cross
Abstract: Mixture-of-Experts (MoE) large language models (LLMs) are among the top-performing architectures. The largest models, often with hundreds of billions of parameters, pose significant memory challenges f…
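Since the abstract is truncated before the method is described, the following is only a generic sketch of the idea named in the title: reducing the memory footprint of an MoE layer by merging similar experts rather than simply dropping them. Everything here (the pairwise cosine criterion, the simple averaging rule, the toy shapes) is an illustrative assumption, not the paper's REAM procedure.

```python
# Hypothetical illustration only: a generic "merge similar experts" reduction
# for a single MoE layer. This is NOT the REAM algorithm from the paper,
# whose details are not visible in the truncated abstract.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened expert weight tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def merge_experts(experts: list[np.ndarray], target: int) -> list[np.ndarray]:
    """Repeatedly average the most similar pair of experts until only
    `target` experts remain, shrinking the layer's parameter count."""
    experts = [e.copy() for e in experts]
    while len(experts) > target:
        # Find the most similar pair among the remaining experts.
        best, pair = -1.0, (0, 1)
        for i in range(len(experts)):
            for j in range(i + 1, len(experts)):
                s = cosine(experts[i], experts[j])
                if s > best:
                    best, pair = s, (i, j)
        i, j = pair
        # Merge the pair by simple averaging, then drop the second expert.
        experts[i] = 0.5 * (experts[i] + experts[j])
        experts.pop(j)
    return experts

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy MoE layer: 8 experts, each a 16x64 weight matrix.
    experts = [rng.normal(size=(16, 64)) for _ in range(8)]
    kept = merge_experts(experts, target=4)
    print(f"experts: {len(experts)} -> {len(kept)}")
```

In practice, a merge-based reduction like this would also require remapping or retuning the router so that tokens previously sent to a removed expert are routed to its merged counterpart; how REAM handles that is not stated in the visible portion of the abstract.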