Continual Distillation of Teachers from Different Domains
arXiv:2605.04059v1 Announce Type: cross
Abstract: Deep learning models continue to scale, with some requiring more storage than many large-scale datasets. Thus, we introduce a new paradigm: Continual Distillation (CD), where a student learns sequentia…