cs.AI, cs.LG

A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

arXiv:2603.24202v1 Announce Type: new
Abstract: Reinforcement learning (RL) has emerged as a powerful paradigm for improving large language models beyond supervised fine-tuning, yet sustaining performance gains at scale remains an open challenge, as d…