Offline Policy Evaluation for Manipulation Policies via Discounted Liveness Formulation

arXiv:2605.11479v1 Announce Type: cross Policy evaluation is a fundamental component of the development and deployment pipeline for robotic policies. In modern manipulation systems, this problem is particularly challenging: rewards are often sparse, the task progression of evaluation rollouts is often non-monotonic because policies exhibit recovery behaviors, and evaluation rollouts are necessarily of finite length. This finite length