Learning The Minimum Action Distance

arXiv:2506.09276v3 Announce Type: replace-cross This paper presents a state representation framework for Markov decision processes (MDPs) that can be learned solely from state trajectories, requiring neither reward signals nor the actions executed by the agent. We propose learning the minimum action distance (MAD), defined as the minimum number of actions required to transition between states, as a fundamental metric that captures the underlying structure of an environment.
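
To make the definition concrete, the following minimal sketch computes the MAD exactly on a toy environment whose transition graph is fully known, using breadth-first search. It only illustrates the quantity being defined and is not the learning method the paper proposes, which works from state trajectories without access to the transitions. The names `minimum_action_distance` and `toy_transitions`, and the four-state graph itself, are hypothetical and chosen purely for illustration; the sketch also assumes deterministic transitions where each edge corresponds to one action.

```python
from collections import deque


def minimum_action_distance(transitions, source):
    """Return the fewest actions needed to reach each state from `source`.

    `transitions` maps a state to the states reachable in one action
    (an assumed, fully known, deterministic toy model). States that
    cannot be reached from `source` are absent from the result.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        state = queue.popleft()
        for next_state in transitions.get(state, ()):
            if next_state not in dist:
                # First visit in BFS order is via a shortest path.
                dist[next_state] = dist[state] + 1
                queue.append(next_state)
    return dist


# Hypothetical four-state environment with directed transitions.
toy_transitions = {
    "A": ["B"],
    "B": ["C", "A"],
    "C": ["D"],
    "D": ["A"],
}

mad_from_a = minimum_action_distance(toy_transitions, "A")
mad_from_d = minimum_action_distance(toy_transitions, "D")
print(mad_from_a)
print(mad_from_d)
```

In this toy graph, reaching `D` from `A` takes three actions while reaching `A` from `D` takes one, so the MAD need not be symmetric when transitions are directed.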