Author name: Mingyi Liu

The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation

Mingyi Liu / March 30, 2026

arXiv:2603.24124v2 Announce Type: replace-cross
Abstract: RLHF-aligned language models exhibit response homogenization: on TruthfulQA (n=790), 40-79% of questions produce a single semantic cluster across 10 i.i.d. samples. On affected questions, sampl…

cs.AI, cs.CL, cs.LG

The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation

Mingyi Liu / March 26, 2026

arXiv:2603.24124v1 Announce Type: new
Abstract: RLHF-aligned language models exhibit response homogenization: on TruthfulQA (n=790), 40-79% of questions produce a single semantic cluster across 10 i.i.d. samples. On affected questions, sampling-based …