Faster Fixed-Point Methods for Multichain MDPs
arXiv cs.LG
arXiv:2506.20910v2 Announce Type: replace-cross We study value-iteration (VI) algorithms for solving general (a.k.a. multichain) Markov decision processes (MDPs) under the average-reward criterion, a fundamental but theoretically challenging setting.