Tracking Drift: Variation-Aware Entropy Scheduling for Non-Stationary Reinforcement Learning

arXiv:2601.19624v2 Announce Type: replace Real-world reinforcement learning often faces environment drift, but most existing methods rely on a static entropy coefficient or target entropy. This causes over-exploration during stable periods and under-exploration after drift, and it leaves unanswered the principled question of how exploration intensity should scale with drift magnitude.