cs.CL, cs.CV

Through the Lens of Character: Resolving Modality-Role Interference in Multimodal Role-Playing Agent

arXiv:2605.09443v1 Announce Type: new
Abstract: The advancement of Multimodal Large Language Models (MLLMs) has expanded Role-Playing Agents (RPAs) into visually grounded environments. However, human vision is inherently subjective and identity-driven…