Michael Ripa, Jim Davies

CounterMoral: Editing Morals in Language Models

Michael Ripa, Jim Davies / March 31, 2026

arXiv:2603.27338v1 Announce Type: new
Abstract: Recent advancements in language model technology have significantly enhanced the ability to edit factual information. Yet, the modification of moral judgments, a crucial aspect of aligning models with hu…
