cs.AI

SARL: Label-Free Reinforcement Learning by Rewarding Reasoning Topology

arXiv:2603.27977v1
Abstract: Reinforcement learning has become central to improving large reasoning models, but its success still relies heavily on verifiable rewards or labeled supervision. This limits its applicability to open end…