Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning

Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando / April 1, 2026

arXiv:2512.05513v3 Announce Type: replace
Abstract: Large Video-Language Models (Video-LMs) have achieved impressive progress in multimodal understanding, yet their reasoning remains weakly grounded in space and time. We present Know-Show, a new bench…
