cs.AI, cs.CR, cs.LG

Defense effectiveness across architectural layers: a mechanistic evaluation of persistent memory attacks on stateful LLM agents

arXiv:2605.08442v1 Announce Type: cross
Abstract: Persistent memory attacks against LLM agents achieve high attack success rates against open-source models. In these attacks, malicious instructions injected via RAG-retrieved documents are stored in pe…