AutoRubric-T2I: Robust Rule-Based Reward Model for Text-to-Image Alignment

arXiv:2605.17602v1 Announce Type: cross Aligning Text-to-Image (T2I) generation models with human preferences increasingly relies on image reward models that score or rank generated images according to prompt alignment and perceptual quality. Existing reward models are commonly trained as Bradley-Terry (BT) preference models on large-scale human preference corpora, making them costly to train, difficult to adapt, and opaque in their evaluation criteria.
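
For readers unfamiliar with the Bradley-Terry setup mentioned above, here is a minimal sketch of the standard pairwise formulation: the probability that one image is preferred over another is the logistic function of the gap between their scalar reward scores, and training minimizes the negative log-likelihood of the human-chosen image. The function names `bt_preference_prob` and `bt_loss` and the example scores are hypothetical, chosen only for illustration; this is the textbook BT objective, not necessarily the exact loss used by any particular reward model.

```python
import math


def bt_preference_prob(score_preferred: float, score_rejected: float) -> float:
    """Bradley-Terry probability that the first item beats the second.

    Hypothetical helper: the scores stand in for scalar reward-model outputs.
    """
    gap = score_preferred - score_rejected
    return 1.0 / (1.0 + math.exp(-gap))


def bt_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-likelihood of the human-preferred item under Bradley-Terry."""
    return -math.log(bt_preference_prob(score_preferred, score_rejected))


# Illustrative scores only (not taken from any dataset).
agree = bt_loss(2.0, 0.0)     # model ranks the preferred image higher: small loss
disagree = bt_loss(0.0, 2.0)  # model ranks it lower: larger loss
```

The loss depends only on the difference between the two scores, so a model trained this way learns a single opaque scalar per image rather than explicit evaluation criteria, which is the opacity the abstract refers to.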