cs.AI, cs.CL, cs.LG

Proximal Supervised Fine-Tuning

arXiv:2508.17784v2 Announce Type: replace-cross
Abstract: Supervised fine-tuning (SFT) of foundation models often leads to poor generalization, where prior capabilities deteriorate after tuning on new tasks or domains. Inspired by trust-region policy …