Continuous-time q-learning for mean-field control with common noise, part-I: Theoretical foundations

arXiv:2604.27372v1. This paper investigates the continuous-time counterpart of the Q-function for entropy-regularized mean-field control (MFC) with controlled common noise, termed the q-function by Jia and Zhou in the single-agent setting. We first show that, under discretely sampled actions, the value function in the exploratory formulation converges to the value function in the relaxed control formulation as the time grid is refined.