cs.AI, cs.CL

Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges

arXiv:2603.23659v1 Announce Type: cross
Abstract: When large language models make ethical judgments, do their internal representations distinguish between normative frameworks, or collapse ethics into a single acceptability dimension? We probe hidden …