A Modularized Framework for Piecewise-Stationary Restless Bandits

arXiv:2604.10177v1 Announce Type: cross We study the piecewise-stationary restless multi-armed bandit (PS-RMAB) problem, where each arm evolves as a Markov chain but \emph{mean rewards may change across unknown segments}. To address the resulting trade-off between exploration and detection delay, we propose a modular framework that integrates arbitrary RMAB base algorithms with change detection and a novel diminishing exploration mechanism. This design enables flexible plug-and-play use of existing solvers and detectors, while efficiently adapting to mean changes without prior knowledge of their number.
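
The abstract gives no algorithmic details, so the following is only a minimal sketch of how such a modular design could be wired together, not the authors' method. Every name in it (`MeanShiftDetector`, `GreedyBase`, `ModularAgent`, `simulate`) is hypothetical. It also rests on several loudly stated assumptions: the base policy is a greedy placeholder rather than a real RMAB solver, the detector is a simple two-window mean comparison, the forced-exploration rate decays as `explore_scale / sqrt(steps since last restart)`, and rewards are deterministic instead of being generated by Markov chains.

```python
import random
from collections import deque


class MeanShiftDetector:
    """Toy detector (assumption): alarms when the two halves of a
    sliding window of rewards have clearly different means."""

    def __init__(self, window=20, threshold=0.5):
        self.window = window
        self.threshold = threshold
        self.samples = deque(maxlen=window)

    def update(self, reward):
        self.samples.append(reward)
        if len(self.samples) < self.window:
            return False  # not enough data to compare yet
        half = self.window // 2
        data = list(self.samples)
        old_mean = sum(data[:half]) / half
        new_mean = sum(data[half:]) / (self.window - half)
        return abs(new_mean - old_mean) > self.threshold

    def reset(self):
        self.samples.clear()


class GreedyBase:
    """Placeholder base policy (NOT an RMAB solver): plays each arm once,
    then the arm with the best empirical mean."""

    def __init__(self, n_arms):
        self.n_arms = n_arms
        self.reset()

    def reset(self):
        self.counts = [0] * self.n_arms
        self.sums = [0.0] * self.n_arms

    def select(self):
        for arm in range(self.n_arms):
            if self.counts[arm] == 0:
                return arm
        return max(range(self.n_arms),
                   key=lambda a: self.sums[a] / self.counts[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


class ModularAgent:
    """Wraps any base policy with per-arm detectors and forced exploration
    whose rate shrinks with the time since the last restart."""

    def __init__(self, base, detectors, rng, explore_scale=1.0):
        self.base = base
        self.detectors = detectors
        self.rng = rng
        self.explore_scale = explore_scale
        self.since_restart = 0
        self.restarts = 0

    def select(self):
        # Assumed schedule: exploration probability decays as 1/sqrt(t).
        rate = min(1.0, self.explore_scale / (self.since_restart + 1) ** 0.5)
        if self.rng.random() < rate:
            return self.rng.randrange(len(self.detectors))
        return self.base.select()

    def update(self, arm, reward):
        self.since_restart += 1
        self.base.update(arm, reward)
        if self.detectors[arm].update(reward):
            # Change detected: restart the base policy, the detectors,
            # and the exploration schedule.
            self.base.reset()
            for detector in self.detectors:
                detector.reset()
            self.since_restart = 0
            self.restarts += 1


def simulate(horizon=400, change_at=200, seed=0):
    """Two arms whose (deterministic) rewards swap at `change_at`."""
    rng = random.Random(seed)
    detectors = [MeanShiftDetector(), MeanShiftDetector()]
    agent = ModularAgent(GreedyBase(2), detectors, rng)
    picks = []
    for t in range(horizon):
        means = (1.0, 0.0) if t < change_at else (0.0, 1.0)
        arm = agent.select()
        agent.update(arm, means[arm])
        picks.append(arm)
    return agent.restarts, picks
```

Calling `simulate()` returns the number of restarts and the sequence of chosen arms. The point of the sketch is the separation of concerns: `GreedyBase` and `MeanShiftDetector` can each be swapped for another solver or detector with the same `select`/`update`/`reset` interface without touching `ModularAgent`.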