Intrinsic Gradient Suppression for Label-Noise Prompt Tuning in Vision-Language Models
arXiv:2605.00591v1 Announce Type: new
Abstract: Contrastive vision-language models such as CLIP exhibit remarkable zero-shot generalization. However, prompt tuning remains highly sensitive to label noise, as mislabeled samples generate disproportionately…