cs.AI

CounterMoral: Editing Morals in Language Models

arXiv:2603.27338v1 Announce Type: new
Abstract: Recent advances in language models have significantly improved the ability to edit the factual information they store. Yet, the modification of moral judgments, a crucial aspect of aligning models with hu…