cs.AI, cs.CV

Multimodal Contextualized Support for Enhancing Video Retrieval System

arXiv:2412.07584v2
Abstract: Current video retrieval systems, especially those used in competitions, primarily focus on querying individual keyframes or images rather than encoding an entire clip or video segment. However, queri…