Buck - Provide.ai

A review of “Investigating the consequences of accidentally grading CoT during RL”

Buck / May 7, 2026

Last week, OpenAI staff shared an early draft of “Investigating the consequences of accidentally grading CoT during RL” with Redwood Research staff. To start with, I appreciate them publishing this post. I think it is valuable for AI companies to be trans…
