cs.AI, cs.CV

STORM: End-to-End Referring Multi-Object Tracking in Videos

arXiv:2604.10527v1 Announce Type: new
Abstract: Referring multi-object tracking (RMOT) is the task of associating all objects in a video that semantically match given textual queries or referring expressions. Existing RMOT approaches decompose …