Hitting Time Isomorphism for Multi-Stage Planning with Foundation Policies

arXiv CS.LG

arXiv:2605.06470v1 Announce Type: new

We present a new operator-theoretic representation learning framework for offline reinforcement learning that recovers the directed temporal geometry of a controlled Markov process from hitting time observations. While prior art often produces symmetric distances or fails to satisfy the triangle inequality, our framework learns a Hilbert-space displacement geometry where expected hitting times are realized as linear functionals of latent displacements.
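
To make the "directed" geometry concrete, here is a minimal sketch that computes expected hitting times for a small Markov chain and shows that they are asymmetric yet still obey the triangle inequality. It is not the paper's method: the three-state chain, the function name `hitting_times`, and the iteration count are assumptions chosen for illustration, and a fixed transition matrix stands in for a controlled process under a single policy.

```python
def hitting_times(P, target, iters=200):
    """Expected number of steps to first reach `target` from every state.

    Iterates h(i) = 1 + sum_k P[i][k] * h(k) for i != target, with
    h(target) = 0. Illustrative helper, not taken from the paper.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(iters):
        h = [
            0.0 if i == target
            else 1.0 + sum(P[i][k] * h[k] for k in range(n))
            for i in range(n)
        ]
    return h


# Hypothetical 3-state chain that mostly moves "forward": 0 -> 1 -> 2 -> 0.
P = [
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
    [0.9, 0.1, 0.0],
]
n = len(P)

# H[i][j] is the expected hitting time of state j starting from state i.
H = [[0.0] * n for _ in range(n)]
for j in range(n):
    column = hitting_times(P, j)
    for i in range(n):
        H[i][j] = column[i]

# Directedness: going with the drift is faster than going against it.
print("forward  0 -> 1:", round(H[0][1], 4))
print("backward 1 -> 0:", round(H[1][0], 4))

# Triangle inequality: a detour through j never beats the direct route.
triangle_ok = all(
    H[i][k] <= H[i][j] + H[j][k] + 1e-9
    for i in range(n) for j in range(n) for k in range(n)
)
print("triangle inequality holds:", triangle_ok)
```

In this toy chain the forward hitting time is about 1.21 steps and the backward one about 2.09, so a symmetric distance could not represent both. That asymmetry is what a displacement-based representation of the kind described in the abstract would have to capture.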