AI RESEARCH
Generative Actor-Critic with Soft Bridge Policies
arXiv CS.LG
arXiv:2605.08733v1 Announce Type: new Expressive generative policies such as diffusion and flow models are appealing for MaxEnt online reinforcement learning because of their ability to model multimodal and highly non-Gaussian action distributions. However