RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning

arXiv:2604.15201v1 Announce Type: new As reinforcement learning (RL) deployments expand into safety-critical domains, existing evaluation methods fail to systematically identify hazards arising from the black-box nature of neural-network-enabled policies and distributional shift between