cs.CV

Video-ToC: Video Tree-of-Cue Reasoning

arXiv:2604.20473v1 Announce Type: new
Abstract: Existing Video Large Language Models (Video LLMs) struggle with complex video understanding, exhibiting limited reasoning capabilities and potential hallucinations. In particular, these methods tend to p…