cs.CV

LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval

arXiv:2505.15269v2 Announce Type: replace
Abstract: Recent developments in Video Large Language Models (Video LLMs) have enabled models to process hour-long videos and exhibit exceptional performance. Nonetheless, the Key-Value (KV) cache expands line…