ClipTTT: CLIP-Guided Test-Time Training Helps LVLMs See Better
arXiv:2603.26486v1 Announce Type: new
Abstract: Large vision-language models (LVLMs) tend to hallucinate, especially when visual inputs are corrupted at test time. We show that such corruptions act as additional distribution shifts, significantly ampl…