cs.CL

Open-Domain Safety Policy Construction

arXiv:2604.01354v1 Announce Type: new
Abstract: Moderation layers are increasingly a core component of many products built on user- or model-generated content. However, drafting and maintaining domain-specific safety policies remain costly. We presen…