Beyond Distribution Sharpening: The Importance of Task Rewards
arXiv:2604.16259v1 Announce Type: cross
Abstract: Frontier models have demonstrated exceptional capabilities following the integration of task-reward-based reinforcement learning (RL) into their training pipelines, enabling systems to evolve from pure…