When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

arXiv:2605.16312v1 Announce Type: new We study adversarial action masking in self-play reinforcement learning: an attacker selectively removes legal actions from a victim's action set. Unlike observation or action perturbations, removal eliminates decision options before the agent acts. Across poker games scaling from 6 to 5,531 information states and two non-poker domains, learned masking causes substantially more damage than random masking and learned perturbation baselines.
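
To make the threat model concrete, here is a minimal sketch of action removal at a single decision point. It is illustrative only and is not the paper's method: the function names (`apply_removal`, `greedy_removal`, `expected_value`), the removal budget, the renormalization of the victim's policy over surviving actions, and the greedy attacker are all assumptions. In particular, the greedy heuristic stands in for the learned masker the abstract describes.

```python
from typing import Dict, Set


def apply_removal(policy: Dict[str, float], legal: Set[str],
                  removed: Set[str]) -> Dict[str, float]:
    """Renormalize the victim's policy over legal actions that survive removal.

    Assumption: the victim reacts to a removed action by renormalizing its
    existing policy; the paper may model the victim's response differently.
    """
    remaining = legal - removed
    if not remaining:
        raise ValueError("the attacker must leave at least one legal action")
    total = sum(policy.get(a, 0.0) for a in remaining)
    if total == 0.0:
        # The victim put no mass on any surviving action: fall back to uniform.
        return {a: 1.0 / len(remaining) for a in remaining}
    return {a: policy.get(a, 0.0) / total for a in remaining}


def greedy_removal(legal: Set[str], values: Dict[str, float],
                   budget: int = 1) -> Set[str]:
    """Hypothetical attacker: remove the victim's highest-value actions.

    This heuristic is a stand-in for a learned masking policy. It always
    leaves at least one legal action available.
    """
    ranked = sorted(legal, key=lambda a: values[a], reverse=True)
    k = max(0, min(budget, len(legal) - 1))
    return set(ranked[:k])


def expected_value(policy: Dict[str, float], values: Dict[str, float]) -> float:
    """Victim's expected value under a policy and per-action values."""
    return sum(p * values[a] for a, p in policy.items())


# Toy decision point with made-up numbers (not taken from the paper).
legal = {"fold", "call", "raise"}
policy = {"fold": 0.1, "call": 0.3, "raise": 0.6}
values = {"fold": -1.0, "call": 0.2, "raise": 0.5}

removed = greedy_removal(legal, values, budget=1)
masked_policy = apply_removal(policy, legal, removed)
before = expected_value(policy, values)
after = expected_value(masked_policy, values)
```

In this toy case the attacker removes `raise`, the victim's mass shifts to `fold` and `call`, and `after` falls below `before`. Unlike a perturbation, the removed action cannot be chosen at all, which is the distinction the abstract draws.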