cs.CV

TIE: Time Interval Encoding for Video Generation over Events

arXiv:2605.10543v1 Announce Type: new
Abstract: Director-style prompting, robotic action prediction, and interactive video agents demand temporal grounding over concurrent events — a regime in which 68% of general clips and over 99% of robotics/gamep…

cs.CV

SAMOFT: Robust Multi-Object Tracking via Region and Flow

arXiv:2605.09417v1 Announce Type: new
Abstract: Multi-object tracking (MOT) is a fundamental task in computer vision that requires continuously tracking multiple targets while maintaining consistent identities across frames. However, most existing app…

cs.CV

Qwen-Image-2.0 Technical Report

arXiv:2605.10730v1 Announce Type: new
Abstract: We present Qwen-Image-2.0, an omni-capable image generation foundation model that unifies high-fidelity generation and precise image editing within a single framework. Despite recent progress, existing m…
