ZINA: Multimodal Fine-grained Hallucination Detection and Editing
arXiv:2506.13130v2 Announce Type: replace
Abstract: Multimodal Large Language Models (MLLMs) often generate hallucinations, where the output deviates from the visual content. Given that these hallucinations can take diverse forms, detecting hallucinat…