Explanation Quality Assessment as Ranking with Listwise Rewards
arXiv:2604.24176v1 Announce Type: new
Abstract: We reformulate explanation quality assessment as a ranking problem rather than a generation problem. Instead of optimizing models to produce a single “best” explanation token-by-token, we train reward mo…
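
To illustrate what a listwise ranking objective over candidate explanations can look like, here is a minimal sketch. The abstract is cut off before it names a loss, so this uses the standard ListNet top-1 cross-entropy as a stand-in and may differ from the paper's actual reward formulation. The function names `listnet_loss` and `rank_candidates`, and the graded `quality` labels, are hypothetical.

```python
import math


def _softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(scores, quality):
    """ListNet top-1 loss: cross-entropy between softmax(quality) and softmax(scores).

    scores:  model scores for each candidate explanation of one input (assumed).
    quality: graded quality labels for the same candidates (assumed).
    """
    if not scores or len(scores) != len(quality):
        raise ValueError("scores and quality must be non-empty and the same length")
    target = _softmax(quality)
    predicted = _softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


def rank_candidates(scores):
    """Return candidate indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

The loss is computed over the whole candidate list at once, so it depends on how the scores compare with each other and not on any single score in isolation. That is the sense in which such an objective is listwise rather than pointwise.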