HieraMamba: Video Temporal Grounding via Hierarchical Anchor-Mamba Pooling
arXiv:2510.23043v2 Announce Type: replace
Abstract: Video temporal grounding, the task of localizing the start and end times of the moment described by a natural language query in an untrimmed video, requires capturing both global context and fine-grained temporal detail. This …