cs.CL, cs.CV

CLIP-SVD: Efficient and Interpretable Vision-Language Adaptation via Singular Values

arXiv:2509.03740v3 Announce Type: replace
Abstract: Vision-language models (VLMs) like CLIP have shown impressive zero-shot and few-shot learning capabilities across diverse applications. However, adapting these models to new fine-grained domains rema…