RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
arXiv:2604.11626v2
Abstract: Most reward models for visual generation reduce rich human judgments to a single unexplained score, discarding the reasoning that underlies preference. We show that teaching reward models to produce …