Younes Hourri, Mohammad Mozaffari, Maryam Mehri Dehnavi

PATCH: Learnable Tile-level Hybrid Sparsity for LLMs

Younes Hourri, Mohammad Mozaffari, Maryam Mehri Dehnavi / April 30, 2026

arXiv:2509.23410v4
Abstract: Large language models (LLMs) deliver impressive performance but incur prohibitive memory and compute costs at deployment. Model pruning is an effective way to reduce these overheads, yet existi…
