Mehryar Mohri, Jon Schneider, Yifan Wu

Distributional Alignment Games for Answer-Level Fine-Tuning

Mehryar Mohri, Jon Schneider, Yifan Wu / May 1, 2026

arXiv:2604.27166v1
Abstract: We focus on the problem of Answer-Level Fine-Tuning (ALFT), where the goal is to optimize a language model based on the correctness or properties of its final answers, rather than the specific rea…
