GridVAD: Open-Set Video Anomaly Detection via Spatial Reasoning over Stratified Frame Grids
arXiv:2603.25467v2 Announce Type: replace
Abstract: Vision-Language Models (VLMs) are powerful open-set reasoners, yet their direct use as anomaly detectors in video surveillance is fragile: without calibrated anomaly priors, they alternate between mi…