WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control

arXiv:2602.14351v2 Announce Type: replace-cross Model-based reinforcement learning promises strong sample efficiency but often underperforms in practice due to compounding model error, unimodal world models that average over multi-modal dynamics, and overconfident predictions that bias learning. We