cs.AI, cs.CL

Exploring Reasoning Reward Model for Agents

arXiv:2601.22154v2 Announce Type: replace-cross
Abstract: Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use. However, most methods still rely on sparse outcome-based…
