Jailbroken Frontier Models Retain Their Capabilities
arXiv:2605.00267v1 Announce Type: cross
Abstract: As language model safeguards become more robust, attackers are pushed toward developing increasingly complex jailbreaks. Prior work has found that this complexity imposes a “jailbreak tax” that degrade…