Value-Guided Iterative Refinement and the DIQ-H Benchmark for Evaluating VLM Robustness
arXiv:2512.03992v2 Announce Type: replace-cross
Abstract: Vision-Language Models (VLMs) are essential for embodied AI and safety-critical applications, such as robotics and autonomous systems. However, existing benchmarks primarily focus on static or …