When Backdoors Meet Partial Observability: Attacking Real-World Reinforcement Learning

arXiv:2601.14104v2

Backdoor attacks can cause reinforcement learning (RL) policies to behave normally on clean inputs while executing malicious behaviors when a trigger is present. Existing RL backdoor attacks have been studied primarily in simulation and often assume that attackers can reliably manipulate the observations that drive policy decisions. This assumption becomes fragile in real-world deployment, where RL policies commonly rely on multimodal observations.