Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning

arXiv:2604.13504v1 Announce Type: cross Designing effective reward functions is a cornerstone of reinforcement learning (RL), yet it remains a challenging and labor-intensive process due to the inefficiencies and inconsistencies inherent in traditional methods. Existing methods often rely on extensive manual design and evaluation steps, which are prone to redundancy and overlook local uncertainties at intermediate decision points.