[R] A Gradient Descent Misalignment — Causes Normalisation To Emerge
r/MachineLearning
This paper, just accepted at the GRaM