AI RESEARCH
ROAD: Adaptive Data Mixing for Offline-to-Online Reinforcement Learning via Bi-Level Optimization
arXiv cs.LG
arXiv:2605.14497v1 Announce Type: new Offline-to-online reinforcement learning harnesses the stability of offline pre