Improved Learning Rates for Stochastic Optimization

arXiv:2107.08686v3 Announce Type: replace

Stochastic optimization is a cornerstone of modern machine learning. This paper studies the generalization performance of two classical stochastic optimization algorithms: stochastic gradient descent (SGD) and Nesterov's accelerated gradient (NAG). We establish new learning rates for both algorithms, with improved guarantees in some settings and comparable rates under weaker assumptions in others. We also provide numerical experiments to support the theory.