EnsemHalDet: Robust VLM Hallucination Detection via Ensemble of Internal State Detectors
arXiv:2604.02784v1
Abstract: Vision-Language Models (VLMs) excel at multimodal tasks, but they remain vulnerable to hallucinations that are factually incorrect or ungrounded in the input image. Recent work suggests that hallucinatio…