BicKD: Bilateral Contrastive Knowledge Distillation
arXiv:2602.01265v2 Announce Type: replace
Abstract: Knowledge distillation (KD) is a machine learning framework that transfers knowledge from a teacher model to a student model. The vanilla KD proposed by Hinton et al. has been the dominant approach i…
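For context, the vanilla KD objective of Hinton et al. referenced in the abstract can be sketched in pure Python: a temperature-softened KL term between teacher and student predictions is mixed with the standard cross-entropy on the true label. The temperature `T`, mixing weight `alpha`, and function names are illustrative assumptions; this is the classic baseline, not the paper's BicKD method.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: higher T yields softer distributions.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def vanilla_kd_loss(student_logits, teacher_logits, true_label, T=4.0, alpha=0.5):
    # Vanilla KD (Hinton et al., sketch):
    #   alpha * T^2 * KL(teacher_T || student_T) + (1 - alpha) * CE(label, student)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    hard_probs = softmax(student_logits, 1.0)
    ce = -math.log(hard_probs[true_label])
    return alpha * (T ** 2) * kl + (1 - alpha) * ce
```

When student and teacher logits coincide, the KL term vanishes and only the (weighted) cross-entropy remains, which is a quick sanity check on any implementation.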