LensWalk: Agentic Video Understanding by Planning How You See in Videos
arXiv:2603.24558v1 Announce Type: cross
Abstract: The dense, temporal nature of video presents a profound challenge for automated analysis. Despite the use of powerful Vision-Language Models, prevailing methods for video understanding are limited by t…