Near OOD Detection for Vision-Language Prompt Learning with Contrastive Logit Score

arXiv:2405.16091v2 Announce Type: replace Abstract: Prompt learning has emerged as an efficient and effective method for fine-tuning vision-language models such as CLIP. While many studies have explored generalisation abilities of these models in few-shot classification tasks and a few studies have addressed far out-of-distribution (OOD) of the models, their potential for addressing near OOD detection remains underexplored. Existing methods either require training from scratch, need fine-tuning, or are not designed for vision-language prompt learning. To address this, we introduce the Contrastive Logit Score (CLS), a novel post-hoc, plug-and-play scoring function. CLS significantly improves near OOD detection of pre-trained vision-language prompt learning methods without modifying their model architectures or requiring retraining. Our method achieves up to an 11.67% improvement in AUROC for near OOD detection with minimal computational overhead. Extensive evaluations validate the effectiveness, efficiency, and generalisability of our approach. Our code is available at https://github.com/davidmcjung/near-OOD-prompt-learning.

Leave a Comment