cs.CV, cs.LG

Ramen: Robust Test-Time Adaptation of Vision-Language Models with Active Sample Selection

arXiv:2604.21728v1 Announce Type: new
Abstract: Pretrained vision-language models such as CLIP exhibit strong zero-shot generalization but remain sensitive to distribution shifts. Test-time adaptation updates models during inference without access to s…