Yifan Xu, Junren Chen, Yifan Chen

How You Begin is How You Reason: Driving Exploration in RLVR via Prefix-Tuned Priors

Yifan Xu, Junren Chen, Yifan Chen / May 12, 2026

arXiv:2605.08817v1 Announce Type: new
Abstract: Reinforcement learning with verifiable rewards (RLVR) has recently thrived in large language model (LLM) reasoning tasks. However, reward sparsity and the long reasoning horizon make effective exploratio…
