cs.AI, cs.CR

FreakOut-LLM: The Effect of Emotional Stimuli on Safety Alignment

arXiv:2604.04992v1 Announce Type: cross
Abstract: Safety-aligned LLMs undergo refusal training to reject harmful requests, but whether these mechanisms remain effective under emotionally charged stimuli remains unexplored. We introduce FreakOut-LLM, a f…