Understanding Reinforcement Learning with Neural Networks Part 2: Why Backpropagation Is Not Enough

In the previous article, we explored an example where reinforcement learning is required and standard methods do not work. In this article, we will understand why policy gradients are needed, and why the standard backpropagation method does not work in certain situations. How Backpropagation Normally Works Assume we have the following