cs.CL, cs.IR

Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

arXiv:2602.23440v3 Announce Type: replace
Abstract: Reinforcement learning has emerged as an effective paradigm for training large language models to interleave reasoning with search engine calls. However, existing approaches face a fundamental credit…