cs.CV

A Heterogeneous Two-Stream Framework for Video Action Recognition with Comparative Fusion Analysis

arXiv:2604.23415v1 Announce Type: new
Abstract: Most two-stream action recognition networks apply the same convolutional backbone to both the RGB and optical flow streams, ignoring that the two modalities have fundamentally different structural p…