AI RESEARCH
Reward-Conditioned Reinforcement Learning
arXiv cs.LG
arXiv:2603.05066v2 Announce Type: replace
Single-task RL agents are typically trained under a fixed reward function, which limits their robustness to reward misspecification and their ability to adapt to changing preferences. We