InfoSFT: Learn More and Forget Less with Information-Aware Token Weighting
arXiv:2605.14967v1 Announce Type: cross
Abstract: Supervised fine-tuning (SFT) is the standard approach for teaching LLMs new behaviors from offline expert demonstrations. However, standard SFT uniformly fits all samples, including those with …
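
To make the contrast between uniform fitting and token weighting concrete, here is a minimal Python sketch. It is illustrative only: the truncated abstract does not specify how InfoSFT computes its weights, so the function name `sft_loss`, the log-probabilities, and the weight values below are all hypothetical placeholders and not the paper's method. With no weights, the function reduces to the usual mean negative log-likelihood over target tokens, which is the uniform objective the abstract refers to.

```python
import math


def sft_loss(token_log_probs, weights=None):
    """Weighted average negative log-likelihood over target tokens.

    weights=None gives every token weight 1.0, i.e. the uniform
    objective of standard SFT. Any other weights are an assumption
    made for illustration, not the scheme used in the paper.
    """
    if weights is None:
        weights = [1.0] * len(token_log_probs)
    if len(weights) != len(token_log_probs):
        raise ValueError("weights and token_log_probs must have equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * lp for w, lp in zip(weights, token_log_probs))
    return -weighted_sum / total_weight


# Hypothetical log-probabilities the model assigns to three target tokens.
log_probs = [math.log(0.9), math.log(0.5), math.log(0.1)]

# Standard SFT: every token contributes equally.
uniform = sft_loss(log_probs)

# Hypothetical weights that down-weight the third token.
weighted = sft_loss(log_probs, weights=[1.0, 1.0, 0.2])
```

In this toy case the third token dominates the uniform loss because the model assigns it low probability; down-weighting it lowers the average and shifts the training signal toward the other tokens. Which tokens deserve more or less weight is the question the paper addresses, and this sketch takes no position on it.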