cs.AI, cs.CL

Hybrid Policy Distillation for LLMs

arXiv:2604.20244v1 Announce Type: new
Abstract: Knowledge distillation (KD) is a powerful paradigm for compressing large language models (LLMs), whose effectiveness depends on intertwined choices of divergence direction, optimization strategy, and dat…
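The "divergence direction" choice the abstract refers to is typically forward vs. reverse KL between teacher and student token distributions. A minimal sketch of the two directions, using hypothetical logits over a tiny vocabulary (the softmax/KL helpers here are illustrative, not from the paper):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    # KL(p || q) = sum_i p_i * log(p_i / q_i); skip zero-mass terms.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical teacher and student next-token distributions.
teacher = softmax([2.0, 0.5, -1.0])
student = softmax([1.0, 1.0, 0.0])

# Forward KL (teacher || student): mode-covering — the student is
# penalized wherever it under-weights teacher probability mass.
forward_kl = kl(teacher, student)

# Reverse KL (student || teacher): mode-seeking — the student is
# penalized for placing mass where the teacher has little.
reverse_kl = kl(student, teacher)

print(f"forward KL = {forward_kl:.4f}, reverse KL = {reverse_kl:.4f}")
```

The two directions are generally not equal, which is why distillation methods that mix or switch between them behave differently from either alone.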