Information-Consistent Language Model Recommendations through Group Relative Policy Optimization
arXiv:2512.12858v3 Announce Type: replace
Abstract: Large Language Models (LLMs) are increasingly deployed in business-critical domains such as finance, education, healthcare, and customer support, where users expect consistent and reliable recommenda…