cs.AI, cs.CL, cs.LG

Budgeted LoRA: Distillation as Structured Compute Allocation for Efficient Inference

arXiv:2605.04341v1 Announce Type: new
Abstract: We study distillation for large language models under explicit compute constraints, with the goal of producing student models that are not only cheaper to train, but structurally efficient at inference time…
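
Since the abstract is truncated and does not describe the paper's actual allocation procedure, the following is only a minimal illustrative sketch of standard LoRA low-rank adaptation (Hu et al., 2021) combined with a toy per-layer rank budget. The helper `allocate_ranks` and the uniform split it performs are hypothetical placeholders, not the method proposed in this paper.

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""

    def __init__(self, base: nn.Linear, rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the pretrained weight stays frozen
        self.scale = alpha / rank
        # Low-rank factors: A projects down to rank r, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)


def allocate_ranks(num_layers: int, total_rank_budget: int) -> list[int]:
    """Toy uniform allocation of a total adapter-rank budget across layers.

    A real budgeted scheme would presumably allocate non-uniformly
    (e.g. by layer sensitivity); this uniform split is a placeholder only.
    """
    per_layer = max(1, total_rank_budget // num_layers)
    return [per_layer] * num_layers

Under this sketch, a student would attach a LoRALinear adapter of the allocated rank to each target layer and train only the low-rank factors against the teacher's outputs; the total adapter-parameter count is then directly controlled by the rank budget.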