Value-Decomposed Reinforcement Learning Framework for Taxiway Routing with Hierarchical Conflict-Aware Observations

ArXi:2605.08754v1 Announce Type: new Taxiway routing and on-surface conflict avoidance are coupled safety-critical decision problems in airport surface operations. Existing planning and optimization methods are often limited by online computational cost, while reinforcement learning methods may struggle to represent downstream traffic conflicts and balance multiple objectives. This paper presents Conflict-aware Taxiway Routing (CaTR), a reinforcement learning framework for real-time multi-aircraft taxiway routing. CaTR constructs a grid-based airport surface environment with action masking.