Xuan Gong, Senmiao Wang, Hanbo Huang, Ruoyu Sun, Shiyu Liang

VCORE: Variance-Controlled Optimization-based Reweighting for Chain-of-Thought Supervision

Xuan Gong, Senmiao Wang, Hanbo Huang, Ruoyu Sun, Shiyu Liang / April 21, 2026

arXiv:2510.27462v2
Abstract: Supervised fine-tuning (SFT) on long chain-of-thought (CoT) trajectories has emerged as a crucial technique for enhancing the reasoning abilities of large language models (LLMs). However, the standar…
