cs.CL

Distribution Corrected Offline Data Distillation for Large Language Models

arXiv:2605.14071v1 Announce Type: new
Abstract: Distilling reasoning traces from strong large language models into smaller ones is a promising route to improving intelligence in resource-constrained settings. Existing approaches face a fundamental trade…