cs.AI, cs.LG

Reasoning Through Chess: How Reasoning Evolves from Data Through Fine-Tuning and Reinforcement Learning

arXiv:2604.05134v1 Announce Type: cross
Abstract: How can you get a language model to reason on a task it natively struggles with? We study how reasoning evolves in a language model, from supervised fine-tuning (SFT) to reinforcement learning (RL) -…