cs.AI, cs.LG

On the Limits of Layer Pruning for Generative Reasoning in Large Language Models

arXiv:2602.01997v2 Announce Type: replace
Abstract: Recent work has shown that layer pruning can effectively compress large language models (LLMs) while retaining strong performance on classification benchmarks, often with little or no finetuning. In …