AI RESEARCH
Can Adaptive Gradient Methods Converge under Heavy-Tailed Noise? A Case Study of AdaGrad
arXiv CS.LG
•
ArXi:2605.18694v1 Announce Type: cross Many tasks in modern machine learning are observed to involve heavy-tailed gradient noise during the optimization process. To manage this realistic and challenging setting, new mechanisms, such as gradient clipping and gradient normalization, have been