cs.AI

StoryTR: Narrative-Centric Video Temporal Retrieval with Theory of Mind Reasoning

arXiv:2604.23198v1 Announce Type: new
Abstract: Current video moment retrieval excels at action-centric tasks but struggles with narrative content. Models can see "what is happening" but fail to reason about "why it matters". This semantic gap…