cs.AI, cs.CL, cs.LG

Revisiting the Capacity Gap in Chain-of-Thought Distillation from a Practical Perspective

arXiv:2604.08880v1 Announce Type: new
Abstract: Chain-of-thought (CoT) distillation transfers reasoning behaviors from a strong teacher to a smaller student, but prior work reports a capacity gap: distillation may fail when the teacher-student capabil…