Authors: Xingyuan Hua, Sheng Yue, Ju Ren

Learning to Explore: Scaling Agentic Reasoning via Exploration-Aware Policy Optimization

Xingyuan Hua, Sheng Yue, Ju Ren / May 13, 2026

arXiv:2605.08978v2 Announce Type: new
Abstract: Recent advancements in agentic test-time scaling allow models to gather environmental feedback before committing to final actions. A key limitation of existing methods is that they typically employ undif…

cs.AI

Learning to Explore: Scaling Agentic Reasoning via Exploration-Aware Policy Optimization

Xingyuan Hua, Sheng Yue, Ju Ren / May 12, 2026

arXiv:2605.08978v1 Announce Type: new
Abstract: Recent advancements in agentic test-time scaling allow models to gather environmental feedback before committing to final actions. A key limitation of existing methods is that they typically employ undif…