cs.AI, cs.CR

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents

arXiv:2604.22888v1 Announce Type: cross
Abstract: Agent skills introduce a new and more severe form of indirect injection for LLM agents: unlike traditional indirect prompt injection, attackers can hide malicious instructions inside a dense, action-or…