Moltbook Moderation: Uncovering Hidden Intent Through Multi-Turn Dialogue
arXiv:2605.12856v2
Abstract: The emergence of multi-agent systems introduces novel moderation challenges that extend beyond content filtering. Agents with malicious intent may contribute harmful content that appears benign to ev…