Training LLMs with Reinforcement Learning for Intent-Aware Personalized Question Answering
arXiv:2605.12645v1 Announce Type: cross
Abstract: Effective personalized question answering (PQA) in language models requires grounding responses in the user’s underlying intent, where intent refers to the implicit “why” behind a query beyond its ex…