cs.CY, cs.LG

Making Bias Non-Predictive: Training Robust LLM Reasoning via Reinforcement Learning

arXiv:2602.01528v2 Announce Type: replace-cross
Abstract: Large language models (LLMs) increasingly serve as reasoners and automated evaluators, yet they remain susceptible to cognitive biases, often altering their reasoning when faced with spurious…