TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance

arXiv:2509.26627v2 Announce Type: replace Designing dense rewards is crucial for reinforcement learning (RL), yet in robotics it often demands extensive manual effort and scales poorly. One promising solution is to treat task progress as a dense reward signal, since it quantifies how far actions advance the system toward task completion over time. We present TimeRewarder, a simple yet effective reward learning method that derives progress estimation signals from passive videos, including robot demonstrations and human videos, by modeling temporal distances between frame pairs.
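
To make the idea of frame-pair temporal distance concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that the training target for a frame pair (i, j) is the signed index gap normalized by video length, and that a step-wise reward can be read off as the predicted distance between consecutive frames; neither detail is stated in the abstract. The names `sample_pair_targets`, `stepwise_rewards`, and `predict_distance` are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_pair_targets(
    num_frames: int, num_pairs: int, seed: int = 0
) -> List[Tuple[int, int, float]]:
    """Sample frame index pairs (i, j) from one video with a distance target.

    Assumption: the target is the signed gap (j - i) divided by
    (num_frames - 1), so it lies in [-1, 1] regardless of video length.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames to form a pair")
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        i = rng.randrange(num_frames)
        j = rng.randrange(num_frames)
        pairs.append((i, j, (j - i) / (num_frames - 1)))
    return pairs


def stepwise_rewards(
    frames: Sequence[Frame],
    predict_distance: Callable[[Frame, Frame], float],
) -> List[float]:
    """Turn a pairwise distance predictor into one reward per transition.

    Assumption: the reward for step t is the predicted temporal distance
    from frame t to frame t + 1, so forward progress is rewarded and
    regressions receive negative values.
    """
    return [
        predict_distance(frames[t], frames[t + 1])
        for t in range(len(frames) - 1)
    ]


if __name__ == "__main__":
    # Stand-in "frames" are just indices, and the oracle predictor returns
    # the true normalized gap; a learned model would replace it.
    video = list(range(5))
    oracle = lambda a, b: (b - a) / (len(video) - 1)
    print(sample_pair_targets(len(video), num_pairs=3))
    print(stepwise_rewards(video, oracle))
```

Under these assumptions, a trajectory that moves steadily through the demonstrated task accumulates a total reward close to 1, while stalling or moving backward yields zero or negative step rewards.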