Sparse Offline Reinforcement Learning with Corruption Robustness

arXiv:2512.24768v2 Announce Type: replace-cross We investigate robustness to strong data corruption in offline sparse reinforcement learning (RL). In our setting, an adversary may arbitrarily perturb a fraction of the collected trajectories from a high-dimensional but sparse Markov decision process, and our goal is to estimate a near-optimal policy.
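
To make the corruption setting concrete, the following is a minimal illustrative sketch and not the paper's construction. It assumes, purely for illustration, that rewards are linear in high-dimensional features with a sparse parameter vector, and that the adversary overwrites whole trajectories. The names `make_trajectories`, `corrupt`, `eps`, and `magnitude` are hypothetical.

```python
import random

def make_trajectories(n, horizon, dim, support, rng):
    """Clean offline data under an ASSUMED sparse linear reward model."""
    # Sparse parameter: only the coordinates in `support` are nonzero.
    theta = [0.0] * dim
    for j in support:
        theta[j] = 1.0
    data = []
    for _ in range(n):
        traj = []
        for _ in range(horizon):
            phi = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            reward = sum(t * x for t, x in zip(theta, phi))
            traj.append((phi, reward))
        data.append(traj)
    return data

def corrupt(data, eps, rng, magnitude=1e3):
    """Overwrite int(eps * n) whole trajectories with arbitrary values."""
    n = len(data)
    k = int(eps * n)
    bad = set(rng.sample(range(n), k))
    observed = []
    for i, traj in enumerate(data):
        if i in bad:
            # Hypothetical adversary: replaces features and rewards outright.
            observed.append([([magnitude] * len(phi), -magnitude) for phi, _ in traj])
        else:
            observed.append(traj)
    return observed, bad

rng = random.Random(0)
clean = make_trajectories(n=100, horizon=5, dim=50, support=[0, 3, 7], rng=rng)
observed, bad = corrupt(clean, eps=0.1, rng=rng)
```

The learner sees only `observed` and does not know which indices are in `bad`; a robust estimator has to recover a good policy despite that contaminated fraction.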