Semantic-Aware Adaptive Visual Memory for Streaming Video Understanding
arXiv:2605.07897v1 Announce Type: cross
Abstract: Online streaming video understanding requires models to process continuous visual inputs and respond to user queries in real time, where the unbounded stream and unpredictable query timing turn memory …