PA2D-MORL: Pareto Ascent Directional Decomposition based Multi-Objective Reinforcement Learning

arXiv:2603.19579v1

Multi-objective reinforcement learning (MORL) provides an effective solution for decision-making problems involving conflicting objectives. However, achieving high-quality approximations to the Pareto policy set remains challenging, especially in complex tasks with continuous or high-dimensional state-action spaces.