Between a Rock and a Hard Place: The Tension Between Ethical Reasoning and Safety Alignment in LLMs
arXiv:2509.05367v4
Abstract: Large Language Model safety alignment predominantly operates on a binary assumption that requests are either safe or unsafe. This classification proves insufficient when models encounter ethica…