STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming
arXiv:2604.18976v1 Announce Type: new
Abstract: While Large Language Models (LLMs) are widely used, they remain susceptible to jailbreak prompts that can elicit harmful or inappropriate responses. This paper introduces STAR-Teaming, a novel black-box …