Hybrid Policy Distillation for LLMs
arXiv:2604.20244v1 Announce Type: new
Abstract: Knowledge distillation (KD) is a powerful paradigm for compressing large language models (LLMs), whose effectiveness depends on intertwined choices of divergence direction, optimization strategy, and dat…
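The "divergence direction" the abstract refers to is conventionally the choice between forward KL(teacher || student), which is mode-covering, and reverse KL(student || teacher), which is mode-seeking. As a minimal illustrative sketch (not the paper's method; function names and the temperature value are illustrative), both directions for a single token's logits might look like:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def kl(p, q, eps=1e-12):
    # KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0).
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_losses(teacher_logits, student_logits, temperature=2.0):
    """Return (forward_kl, reverse_kl) for one token position.

    Forward KL(teacher || student) spreads student mass over all teacher
    modes; reverse KL(student || teacher) concentrates it on a few.
    This choice is one of the design axes the abstract mentions.
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return kl(p_t, p_s), kl(p_s, p_t)

fwd, rev = distillation_losses([2.0, 0.5, -1.0], [1.0, 1.0, 0.0])
```

In practice these losses are computed per token over the vocabulary and averaged across a sequence; the two directions coincide only when student and teacher distributions match.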