SODA: Semi On-Policy Black-Box Distillation for Large Language Models
arXiv:2604.03873v3 Announce Type: replace-cross
Abstract: Black-box knowledge distillation for large language models presents a strict trade-off. Simple off-policy methods (e.g., sequence-level knowledge distillation) struggle to correct the student’s…