Generalized Distributional Alignment Games for Unbiased Answer-Level Fine-Tuning
arXiv:2605.02435v1 Announce Type: cross
Abstract: The Distributional Alignment Game framework provides a powerful variational perspective on Answer-Level Fine-Tuning (ALFT). However, standard algorithms for these games rely on estimating logarithmic r…