Samee Arif, Naihao Deng, Zhijing Jin, Rada Mihalcea

One Word at a Time: Incremental Completion Decomposition Breaks LLM Safety

Samee Arif, Naihao Deng, Zhijing Jin, Rada Mihalcea / April 30, 2026

arXiv:2604.25921v1 Announce Type: new
Abstract: Large Language Models (LLMs) are trained to refuse harmful requests, yet they remain vulnerable to jailbreak attacks that exploit weaknesses in conversational safety mechanisms. We introduce Incremental …
