cs.CL, cs.LG

OutSafe-Bench: A Benchmark for Multimodal Offensive Content Detection in Large Language Models

arXiv:2511.10287v4 Announce Type: replace-cross
Abstract: As Multimodal Large Language Models (MLLMs) are increasingly integrated into everyday tools and intelligent agents, growing concerns have arisen regarding their possible output of unsa…