Faster Fixed-Point Methods for Multichain MDPs

arXiv:2506.20910v2 We study value-iteration (VI) algorithms for solving general (a.k.a. multichain) Markov decision processes (MDPs) under the average-reward criterion, a fundamental but theoretically challenging setting.
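For orientation, here is a minimal sketch of the plain textbook value iteration for the average-reward criterion, not the faster methods this paper proposes. It applies the Bellman operator repeatedly from zero and estimates the optimal gain as v_n / n. The toy MDP, the function name `value_iteration_gain`, and the arrays `P` and `r` are all hypothetical, chosen only to show a multichain case in which the optimal gain differs across states.

```python
def value_iteration_gain(P, r, num_iters=1000):
    """Estimate the optimal average reward (gain) of each state.

    P[s][a] is a list of next-state probabilities and r[s][a] the reward
    for taking action a in state s. Starting from v_0 = 0, the iterates
    v_{n+1} = T v_n are computed and the gain is estimated as v_n / n.
    """
    n_states = len(P)
    v = [0.0] * n_states
    for _ in range(num_iters):
        # One application of the Bellman optimality operator T.
        v = [
            max(
                r[s][a] + sum(p * v[t] for t, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            )
            for s in range(n_states)
        ]
    return [x / num_iters for x in v]


# Hypothetical multichain MDP: states 0 and 1 are absorbing with rewards
# 1 and 2; state 2 is transient and chooses which of them to enter.
P = [
    [[1.0, 0.0, 0.0]],                   # state 0: stay in 0
    [[0.0, 1.0, 0.0]],                   # state 1: stay in 1
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # state 2: go to 0 or go to 1
]
r = [[1.0], [2.0], [0.0, 0.0]]

gain = value_iteration_gain(P, r)
print(gain)
```

In this toy example the two absorbing states have different gains, so no single scalar average reward describes the whole MDP. The v_n / n estimate for the transient state approaches its limit only at a rate of order 1/n, which illustrates the kind of slow convergence that makes the multichain setting hard.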