Efficient Cross-Domain Offline Reinforcement Learning with Dynamics- and Value-Aligned Data Filtering

ArXi:2512.02435v3 Announce Type: replace Cross-domain offline reinforcement learning (RL) aims to train a well-performing agent in the target environment, leveraging both a limited target domain dataset and a source domain dataset with (possibly) sufficient data coverage. Due to the underlying dynamics misalignment between source and target domains, naively merging the two datasets may incur inferior performance. Recent advances address this issue by selectively leveraging source domain samples whose dynamics align well with the target domain.