Arnon Mazza, Elad Levi

BARRED: Synthetic Training of Custom Policy Guardrails via Asymmetric Debate

Arnon Mazza, Elad Levi / April 29, 2026

arXiv:2604.25203v1
Abstract: Deploying guardrails for custom policies remains challenging, as generic safety models fail to capture task-specific requirements, while prompting LLMs suffers from inconsistent boundary-case performance…
