cs.CV

HieraMamba: Video Temporal Grounding via Hierarchical Anchor-Mamba Pooling

arXiv:2510.23043v2 Announce Type: replace
Abstract: Video temporal grounding, the task of localizing the start and end times of the moment described by a natural language query in an untrimmed video, requires capturing both global context and fine-grained temporal detail. This …