Back to Blackwell: Closing the Loop on Intransitivity in Multi-Objective Preference Fine-Tuning
arXiv:2602.19041v2
Abstract: A recurring challenge in preference fine-tuning (PFT) is handling $\textit{intransitive}$ (i.e., cyclic) preferences. Intransitive preferences often stem from either $\textit{(i)}$ inconsistent ranki…