cs.LG

DRAFT: Task Decoupled Latent Reasoning for Agent Safety

arXiv:2604.03242v1 Announce Type: new
Abstract: The advent of tool-using LLM agents shifts safety monitoring from output moderation to auditing long, noisy interaction trajectories, where risk-critical evidence is sparse-making standard binary supervi…