cs.CL

When Choices Become Risks: Safety Failures of Large Language Models under Multiple-Choice Constraints

arXiv:2604.16916v1 Announce Type: new
Abstract: Safety alignment in large language models (LLMs) is primarily evaluated under open-ended generation, where models can mitigate risk by refusing to respond. In contrast, many real-world applications place…
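To make the contrast in the abstract concrete: below is a minimal sketch (not the paper's benchmark or method) of how an open-ended prompt, where a model may refuse, differs from a multiple-choice constraint that forces a pick among fixed options. The `query_model` function is a hypothetical stand-in for any chat-completion call; everything here is illustrative, not drawn from the paper.

```python
def query_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client."""
    raise NotImplementedError("wire up your own model client here")


def open_ended(question: str) -> str:
    # Open-ended generation: the model is free to refuse
    # ("I can't help with that"), which safety evaluations
    # under this regime typically count as a safe outcome.
    return query_model(question)


def multiple_choice(question: str, options: list[str]) -> str:
    # Multiple-choice constraint: the prompt restricts the answer
    # to one of the listed options, so refusal is no longer a
    # valid response format.
    letters = [chr(ord("A") + i) for i in range(len(options))]
    menu = "\n".join(f"{l}. {o}" for l, o in zip(letters, options))
    prompt = (
        f"{question}\n{menu}\n"
        f"Answer with exactly one letter ({', '.join(letters)})."
    )
    return query_model(prompt)
```

The design point the abstract gestures at: once the answer format excludes refusal, any safety behavior that relied on refusing is unobservable, so measured safety under the two regimes can diverge.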