Authors: Mahdi Sabbaghi, George Pappas, Adel Javanmard, Hamed Hassani

InfoSFT: Learn More and Forget Less with Information-Aware Token Weighting

Mahdi Sabbaghi, George Pappas, Adel Javanmard, Hamed Hassani / May 15, 2026

arXiv:2605.14967v1 Announce Type: cross
Abstract: Supervised fine-tuning (SFT) provides the standard approach for teaching LLMs new behaviors from offline expert demonstrations. However, standard SFT uniformly fits all samples — including those with …

cs.LG

Robust Policy Optimization to Prevent Catastrophic Forgetting

Mahdi Sabbaghi, George Pappas, Adel Javanmard, Hamed Hassani / May 13, 2026

arXiv:2602.08813v2 Announce Type: replace
Abstract: Large language models are commonly trained through multi-stage post-training: first via RLHF, then fine-tuned for other downstream objectives. Yet even small downstream updates can compromise earlier…