CLIP-RD: Relational Distillation for Efficient CLIP Knowledge Distillation
arXiv:2603.25383v1 Announce Type: new
Abstract: CLIP aligns image and text embeddings via contrastive learning and demonstrates strong zero-shot generalization. Its large-scale architecture requires substantial computational and memory resources, moti…
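As an illustration of the contrastive alignment the abstract refers to, here is a minimal sketch of a symmetric image-text contrastive loss in plain Python. It is not the CLIP-RD method and is not taken from the paper. The function names, the toy embeddings, and the fixed temperature of 0.07 are assumptions made for illustration only.

```python
import math


def l2_normalize(vec):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Hypothetical helper: pair i of image_embs and text_embs is the positive
    # pair; every other combination in the batch acts as a negative.
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)

    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]

    # Image-to-text direction: row i should put its mass on column i.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Text-to-image direction: same computation on the transposed matrix.
    logits_t = [list(col) for col in zip(*logits)]
    loss_t2i = -sum(log_softmax(logits_t[j])[j] for j in range(n)) / n

    # Average the two directions to get a symmetric loss.
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    image_embs = [[1.0, 0.0], [0.0, 1.0]]
    matched_texts = [[2.0, 0.0], [0.0, 3.0]]   # same directions as the images
    swapped_texts = [[0.0, 3.0], [2.0, 0.0]]   # pairs deliberately mismatched
    print("matched:", clip_contrastive_loss(image_embs, matched_texts))
    print("swapped:", clip_contrastive_loss(image_embs, swapped_texts))
```

In this toy setup the matched batch yields a loss close to zero, while the swapped batch yields a much larger one, which is the behavior a contrastive objective is meant to reward. A real training setup would use learned encoders and typically a learnable temperature.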