RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents
arXiv:2604.22888v1 Announce Type: cross
Abstract: Agent skills introduce a new and more severe form of indirect injection for LLM agents: unlike traditional indirect prompt injection, attackers can hide malicious instructions inside a dense, action-or…