RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

arXiv:2604.11626v1 Announce Type: new Most reward models for visual generation reduce rich human judgments to a single unexplained score, discarding the reasoning that underlies preference. We show that teaching reward models to produce explicit, multi-dimensional critiques before scoring transforms them from passive evaluators into active optimization tools, improving generators in two complementary ways: at