Internalizing Safety Understanding in Large Reasoning Models via Verification
arXiv:2605.08930v1 Announce Type: new
Abstract: While explicit Chain-of-Thought (CoT) empowers large reasoning models (LRMs), it also enables the generation of riskier final answers. Current alignment paradigms primarily rely on externally enforced complia…