cs.CL

Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning

arXiv:2601.03027v3 Announce Type: replace
Abstract: Preference alignment methods such as RLHF and Direct Preference Optimization (DPO) improve instruction following, but they can also reinforce hallucinations when preference judgments reward fluency a…
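For background only, here is a minimal sketch of the standard DPO objective the abstract refers to; this is the generic formulation from Rafailov et al., not the paper's factuality-aware variant, which the truncated abstract does not spell out. Given a prompt x with a preferred response y_w and a dispreferred response y_l, DPO trains the policy \pi_\theta against a frozen reference policy \pi_{\mathrm{ref}}:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) \;=\; -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} \;-\; \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

Because the loss depends only on which response annotators preferred, a y_w chosen for fluency rather than factual accuracy still receives increased probability mass, which is one mechanism by which preference alignment can reinforce hallucinations, as the abstract notes.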