
MATE: Solving Contextual Markov Decision Processes with Memory of Accumulated Transition Embeddings

arXiv cs.LG

arXiv:2605.17431v1 Announce Type: new

We propose MATE, a simple yet effective memory architecture for solving Contextual Markov Decision Processes (CMDPs), a family of MDPs parameterized by an unobserved context. In a CMDP, an optimal agent can adapt online by maintaining a posterior belief over contexts. MATE replaces this intractable posterior with a sum-aggregated memory, exploiting the posterior's permutation invariance to retain provably sufficient expressiveness.
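To make the idea of a sum-aggregated memory concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the names `make_encoder` and `SumMemory`, the fixed random tanh encoder (a stand-in for a learned network), and the toy transitions are all assumptions introduced for illustration. It relies on the standard assumption that the context stays fixed while transitions are collected, so the posterior depends on the observed transitions only through a product of per-transition likelihoods and is therefore unaffected by their order. A running sum of per-transition embeddings has the same order independence, which the sketch checks numerically.

```python
import math
import random


def make_encoder(in_dim, emb_dim, seed=0):
    """Build a fixed random one-layer tanh encoder.

    Hypothetical stand-in for a learned transition encoder.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(emb_dim)]

    def encode(transition):
        # One tanh unit per embedding dimension.
        return [math.tanh(sum(w * x for w, x in zip(row, transition)))
                for row in weights]

    return encode


class SumMemory:
    """Memory that accumulates transition embeddings by summation."""

    def __init__(self, encode, emb_dim):
        self.encode = encode
        self.state = [0.0] * emb_dim  # running sum of embeddings
        self.count = 0                # number of transitions seen

    def update(self, s, a, r, s_next):
        # Flatten (s, a, r, s') into one vector, embed it, add it to the sum.
        transition = list(s) + list(a) + [r] + list(s_next)
        embedding = self.encode(transition)
        self.state = [m + e for m, e in zip(self.state, embedding)]
        self.count += 1
        return self.state


# Toy transitions: 2-d state, 1-d action, scalar reward, 2-d next state.
transitions = [
    ([0.0, 1.0], [1.0], 0.5, [0.1, 0.9]),
    ([0.1, 0.9], [0.0], -0.2, [0.3, 0.7]),
    ([0.3, 0.7], [1.0], 1.0, [0.6, 0.4]),
]

encode = make_encoder(in_dim=6, emb_dim=4, seed=0)

forward = SumMemory(encode, emb_dim=4)
for t in transitions:
    forward.update(*t)

backward = SumMemory(encode, emb_dim=4)
for t in reversed(transitions):
    backward.update(*t)

# The two memories agree up to floating-point rounding.
order_invariant = all(
    math.isclose(x, y, abs_tol=1e-9)
    for x, y in zip(forward.state, backward.state)
)
print(order_invariant)
```

The memory has constant size and each update costs one embedding plus one vector addition, regardless of how many transitions have been seen. In a full agent the policy would presumably condition on `state` (and possibly `count`) alongside the current observation; that wiring is omitted here because the abstract does not specify it.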