cs.AI

SPARK: Self-Play with Asymmetric Reward from Knowledge Graphs

arXiv:2605.05546v1 Announce Type: new
Abstract: Self-play reinforcement learning has shown strong performance in domains with formally verifiable structure, such as mathematics and coding, where both problem generation and reward computation can be gr…