cs.AI, cs.CL, cs.LG

Knowledge-Level Consistency Reinforcement Learning: Dual-Fact Alignment for Long-Form Factuality

arXiv:2509.23765v3 Announce Type: replace
Abstract: Hallucination in large language models (LLMs) during long-form generation remains difficult to address under existing reinforcement learning from human feedback (RLHF) frameworks, as their preference…