ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning

arXiv:2510.14176v3 Announce Type: replace-cross Reinforcement learning (RL) algorithms are highly sensitive to reward function specification, which remains a central challenge limiting their broad applicability. We present ARM-FM: Automated Reward Machines via Foundation Models, a framework for automated, compositional reward design in RL that leverages the high-level reasoning capabilities of foundation models (FMs