cs.AI, cs.CL

ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

arXiv:2601.09195v2 Announce Type: replace-cross
Abstract: Supervised fine-tuning (SFT) is a fundamental post-training strategy to align Large Language Models (LLMs) with human intent. However, traditional SFT often ignores the one-to-many nature of la…