AI RESEARCH

Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring

arXiv CS.LG • May 04, 2026

arXiv:2605.00754v1 Announce Type: cross Reward models (RMs) have become an indispensable fixture of the language model (LM) post-