CoLoRSMamba: Conditional LoRA-Steered Mamba for Supervised Multimodal Violence Detection
arXiv:2604.03329v1
Abstract: Violence detection benefits from audio, but real-world soundscapes can be noisy or weakly related to the visible scene. We present CoLoRSMamba, a directional video-to-audio multimodal architecture that c…