cs.AI, cs.LG

Where Paths Split: Localized, Calibrated Control of Moral Reasoning in Large Language Models

arXiv:2605.03609v1 Announce Type: cross
Abstract: Large language models often display heterogeneous moral preferences across settings. We study inference-time steering toward a desired ethical framework while preserving general competence. We present …