Improved Robustness of Deep Reinforcement Learning for Control of Time-Varying Systems by Bounded Extremum Seeking

ArXi:2510.02490v2 Announce Type: replace In this paper, we study the use of robust model independent bounded extremum seeking (ES) feedback control to improve the robustness of deep reinforcement learning (DRL) controllers for a class of nonlinear time-varying systems. DRL has the potential to learn from large datasets to quickly control or optimize the outputs of many-parameter systems, but its performance degrades catastrophically when the system model changes rapidly over time.