cs.AI

Policy-Grounded Safety Evaluation of 20 Large Language Models

arXiv:2507.14719v2
Abstract: As large language models (LLMs) become increasingly integrated into real-world applications, scalable and rigorous safety evaluation is essential. This paper introduces Aymara AI, a programmatic plat…