Towards Online Multi-Modal Social Interaction Understanding
arXiv:2503.19851v2 Announce Type: replace
Abstract: In this paper, we introduce a new problem, Online-MMSI, where the model must perform multimodal social interaction understanding (MMSI) using only historical information. Given a recorded video and a…