Guobin Shen, Lei Huang, Xiang Cheng, Chenxiao Zhao, Jindong Li, Dongcheng Zhao, Xing Yu

From Generic Correlation to Input-Specific Credit in On-Policy Self Distillation

May 13, 2026

arXiv:2605.11613v1 Announce Type: new
Abstract: On-policy self-distillation has emerged as a promising paradigm for post-training language models, in which the model conditions on environment feedback to serve as its own teacher, providing dense token…
