PiCA: Pivot-Based Credit Assignment for Search Agentic Reinforcement Learning
arXiv:2605.09287v2 Announce Type: new
Abstract: Large Language Model (LLM)-based search agents trained with reinforcement learning (RL) have significantly improved performance on knowledge-intensive tasks. However, existing methods encounter criti…