Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning

arXiv:2603.09427v1 Announce Type: new Reinforcement Learning (RL) has demonstrated strong potential for industrial process control, yet policies trained in simulation often suffer from a significant sim-to-real gap when deployed on physical hardware. This work systematically analyzes how core Markov Decision Process (MDP) design choices -- state composition, target inclusion, reward formulation, termination criteria, and environment dynamics models -- affect this transfer.
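
To make the listed design choices concrete, the following is a minimal sketch of how they might appear as explicit configuration in a simulated environment. Everything here is an assumption for illustration: the `MDPDesign` fields, the `ToyProcessEnv` class, the first-order lag dynamics, and the bound of 10.0 are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class MDPDesign:
    # Hypothetical knobs for illustration only, not the paper's configuration.
    include_target: bool = True           # target inclusion in the observation
    include_prev_action: bool = False     # state composition
    reward_kind: str = "abs"              # reward formulation: "abs" or "squared"
    terminate_out_of_bounds: bool = True  # termination criterion
    time_constant: float = 5.0            # dynamics model parameter (first-order lag)
    max_steps: int = 50                   # episode length limit


class ToyProcessEnv:
    """Toy first-order lag process, a stand-in for a plant simulation model."""

    def __init__(self, design: MDPDesign, target: float = 1.0):
        self.design = design
        self.target = target
        self.reset()

    def reset(self):
        self.x = 0.0
        self.prev_action = 0.0
        self.steps = 0
        return self._observe()

    def _observe(self):
        # State composition: which quantities the policy is allowed to see.
        obs = [self.x]
        if self.design.include_target:
            obs.append(self.target)
        if self.design.include_prev_action:
            obs.append(self.prev_action)
        return obs

    def step(self, action: float):
        # Dynamics model: the process value moves toward the action with lag tau.
        self.x += (action - self.x) / self.design.time_constant
        self.prev_action = action
        self.steps += 1

        # Reward formulation: negative absolute or squared tracking error.
        error = self.target - self.x
        reward = -abs(error) if self.design.reward_kind == "abs" else -(error ** 2)

        # Termination criterion: optionally end the episode when leaving bounds.
        out_of_bounds = abs(self.x) > 10.0
        terminated = self.design.terminate_out_of_bounds and out_of_bounds
        truncated = self.steps >= self.design.max_steps
        return self._observe(), reward, terminated, truncated
```

Under these assumptions, two policies trained with different `MDPDesign` settings see different observations and rewards from the same underlying process, which is the kind of variation whose effect on transfer the abstract describes.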