Beyond Suffixes: Token Position in GCG Adversarial Attacks on Large Language Models
arXiv:2602.03265v2
Abstract: Large Language Models (LLMs) have seen widespread adoption across multiple domains, creating an urgent need for robust safety alignment mechanisms. However, robustness remains challenging due to jail…