cs.CV

RCoT-Seg: Reinforced Chain-of-Thought for Video Reasoning and Segmentation

arXiv:2605.07334v1 Announce Type: new
Abstract: Video Reasoning Segmentation (VRS) aims to segment target objects in videos based on implicit instructions that convey human intent and temporal logic. Existing MLLM-based methods predict masks with a [S…