cs.CL, cs.CR

SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking

arXiv:2605.00974v1 Announce Type: cross
Abstract: LLMs are increasingly equipped with safety alignment mechanisms, yet recent studies demonstrate that they remain vulnerable to jailbreaking attacks that elicit harmful behaviors without explicit policy…