A Systematic Investigation of The RL-Jailbreaker in LLMs
arXiv:2605.07032v1 Announce Type: cross
Abstract: The evolution of generative models from next-token predictors to autonomous engines of complex systems necessitates rigorous safety hardening. Adversarial jailbreaking, the strategic manipulation of mo…