Soft MPCritic: Amortized Model Predictive Value Iteration

arXiv:2604.01477v1

Reinforcement learning (RL) and model predictive control (MPC) offer complementary strengths, yet combining them at scale remains computationally challenging. We propose soft MPCritic, an RL-MPC framework that learns in (soft) value space while using sample-based planning for both online control and value target generation.
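
To make the idea of generating value targets from sample-based planning concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract does not specify the planner, the sampling distribution, or the form of the soft value, so the uniform action sampling, the log-mean-exp aggregation, and every name below (`soft_value`, `soft_value_target`, `dynamics`, `reward`, `terminal_value`, `temperature`) are assumptions chosen for illustration.

```python
import math
import random


def soft_value(returns, temperature):
    """Temperature-scaled log-mean-exp of sampled returns (assumed soft value form).

    Approaches max(returns) as temperature -> 0 and the mean as temperature grows.
    """
    m = max(returns)  # subtract the max for numerical stability
    mean_exp = sum(math.exp((r - m) / temperature) for r in returns) / len(returns)
    return m + temperature * math.log(mean_exp)


def soft_value_target(state, dynamics, reward, terminal_value,
                      horizon=5, num_samples=64, temperature=1.0,
                      gamma=0.99, rng=None):
    """Hypothetical value target: sample action sequences, roll them out
    through a model, bootstrap with a terminal value, then aggregate softly."""
    rng = rng or random.Random(0)
    returns = []
    for _ in range(num_samples):
        s, total, discount = state, 0.0, 1.0
        for _ in range(horizon):
            a = rng.uniform(-1.0, 1.0)  # assumed sampling distribution
            total += discount * reward(s, a)
            s = dynamics(s, a)
            discount *= gamma
        total += discount * terminal_value(s)  # bootstrap from the critic
        returns.append(total)
    return soft_value(returns, temperature)


# Toy scalar system, purely illustrative.
target = soft_value_target(
    state=1.0,
    dynamics=lambda s, a: s + 0.1 * a,
    reward=lambda s, a: -(s ** 2) - 0.1 * a ** 2,
    terminal_value=lambda s: -(s ** 2),
)
```

In a setup like this, the same sampled rollouts could in principle serve both to pick a control action online and to produce a regression target for the value function, which is the kind of sharing the abstract alludes to.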