cs.AI

Reasoning Structure Matters for Safety Alignment of Reasoning Models

arXiv:2604.18946v1 Announce Type: new
Abstract: Large reasoning models (LRMs) achieve strong performance on complex reasoning tasks but often generate harmful responses to malicious user queries. This paper investigates the underlying cause of these s…