Continuous-time q-learning for mean-field control with common noise, part-II: q-learning algorithms

arXiv:2604.27378v1 Announce Type: cross This paper continues the work of Ren and develops q-learning algorithms for mean-field control (MFC) with controlled common noise. Based on the relaxed control formulation, we first establish the martingale condition for the value function and the Iq-function by evaluating them along the conditional state distributions generated by all test policies.