Switching Successor Measures for Hierarchical Zero-shot Reinforcement Learning

arXiv:2605.13207v1 Announce Type: new

Hierarchical reinforcement learning can improve generalization by decomposing long-horizon decision-making into simpler subproblems. However, existing approaches often rely on restrictive design choices, such as fixed temporal abstractions or goal-conditioned objectives, which largely confine them to goal-reaching tasks and limit their applicability to general reward functions. In this paper, we