Explainable Representation of Finite-Memory Policies for POMDPs using Decision Trees

arXiv:2411.13365v2

Partially Observable Markov Decision Processes (POMDPs) are a fundamental framework for decision-making under uncertainty and partial observability. Because optimal policies may in general require infinite memory, they are hard to implement, and most of the associated problems are undecidable. Consequently, finite-memory policies are usually considered instead. However, the algorithms for computing them are typically very complex, and so are the resulting policies.