cs.AI, cs.CL, cs.LG

SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning

arXiv:2604.06636v1 Announce Type: cross
Abstract: Process supervision has emerged as a promising approach for enhancing LLM reasoning, yet existing methods fail to distinguish meaningful progress from mere verbosity, leading to limited reasoning capab…