Integrating Causal DAGs in Deep RL: Activating Minimal Markovian States with Multi-Order Exposure

arXiv:2605.07057v1 Announce Type: new Online reinforcement learning (RL) relies on the Markov property for its performance guarantees, but real-world applications often lack well-defined states given raw observed variables.