DARLING: Detection Augmented Reinforcement Learning with Non-Stationary Guarantees

arXiv:2604.16684v1 Announce Type: new We study model-free reinforcement learning (RL) in non-stationary finite-horizon episodic Markov decision processes (MDPs) without prior knowledge of the non-stationarity. We focus on the piecewise-stationary (PS) setting, where both the reward and transition dynamics can change an arbitrary number of times. We propose Detection Augmented Reinforcement Learning (DARLING), a modular wrapper for PS-RL that applies to both tabular and linear MDPs, without knowledge of the changes.
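
To illustrate the general idea of a detection-augmented wrapper, the following is a minimal sketch of a detect-and-restart loop around an arbitrary base learner. It is not the DARLING algorithm: the abstract does not specify the detection test, so the two-window mean-shift check on episodic returns used here is an assumed stand-in. The class name `RestartOnChangeWrapper`, its parameters `window` and `threshold`, and the `make_learner` factory are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


class RestartOnChangeWrapper:
    """Hypothetical detect-and-restart wrapper (an assumption, not the paper's method)."""

    def __init__(self, make_learner: Callable[[], object],
                 window: int = 20, threshold: float = 0.5) -> None:
        self.make_learner = make_learner      # factory for a fresh base learner
        self.window = window                  # size of each comparison window (assumed)
        self.threshold = threshold            # mean-shift threshold (assumed)
        self.learner = make_learner()
        self.recent: Deque[float] = deque(maxlen=2 * window)
        self.restarts: List[int] = []         # episodes at which a restart happened
        self.episode = 0

    def observe(self, episode_return: float) -> bool:
        """Record one episodic return; restart the learner if a shift is flagged."""
        self.episode += 1
        self.recent.append(episode_return)
        if len(self.recent) < 2 * self.window:
            return False                      # not enough data to compare yet
        values = list(self.recent)
        old_mean = sum(values[: self.window]) / self.window
        new_mean = sum(values[self.window:]) / self.window
        if abs(new_mean - old_mean) > self.threshold:
            # Flagged change: discard the old learner and the stale statistics.
            self.learner = self.make_learner()
            self.recent.clear()
            self.restarts.append(self.episode)
            return True
        return False


if __name__ == "__main__":
    # Synthetic returns with one abrupt change after episode 40.
    wrapper = RestartOnChangeWrapper(make_learner=dict)
    for ret in [1.0] * 40 + [0.0] * 60:
        wrapper.observe(ret)
    print("restart episodes:", wrapper.restarts)
```

The wrapper is modular in the sense that it only needs a factory for the base learner and a scalar signal per episode, so the same loop could sit on top of a tabular or a linear-MDP learner. A test with formal guarantees would replace the fixed threshold used in this sketch.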