Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning

arXiv:2605.00667v1 Announce Type: cross Safety is a primary challenge in real-world reinforcement learning (RL). Formulating safety requirements as state-wise constraints has become a prominent paradigm. Handling state-wise constraints with the Lagrangian method requires a distinct multiplier for every state, so the multipliers must be approximated by a neural network, referred to as a multiplier network. However, applying standard dual gradient ascent to multiplier networks induces severe