Subscribe To My Channel https://www.youtube.com/@huseyin_ozdemir?sub_confirmation=1
Video Contents:
00:00 Drawbacks of Gradient Descent
02:39 Gradient Descent with Momentum
04:46 Momentum Parameter Default Value
05:22 Exponentially Decreasing Weights
09:48 Plot of Normalized Weights
10:08 Run Gradient Descent (without Momentum) with Low Learning Rate on a Second-Order Surface
14:12 Problem When Running Gradient Descent (without Momentum) with High Learning Rate on a Second-Order Surface
15:26 Run Gradient Descent (with Momentum) with High Learning Rate on a Second-Order Surface
16:52 Nesterov's Accelerated Gradient
18:54 Another Implementation of Nesterov's Accelerated Gradient
20:10 Run Nesterov's Accelerated Gradient on a Second-Order Surface
21:25 Performance Comparison of Gradient Descent, Gradient Descent with Momentum, and NAG
* Drawbacks of Vanilla Gradient Descent
* Momentum calculation steps
* Calculating velocity with exponentially decreasing weights (see the first sketch after this list)
* Simulations of Gradient Descent with or without Momentum
* Side effect of Momentum
* Gradient Descent with Nesterov's Accelerated Gradient (NAG)
* Two different implementations of NAG (see the second sketch after this list)
* Simulations of NAG method
* Comparison of Gradient Descent, Gradient Descent with Momentum, and NAG
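A minimal sketch of the momentum update covered in the video, run on a simple second-order (quadratic) surface; the quadratic matrix A, the learning rate lr=0.01, and the momentum beta=0.9 (the common default mentioned above) are illustrative assumptions, not the exact settings from the video:

```python
import numpy as np

def grad(w):
    # Gradient of an elongated quadratic bowl f(w) = 0.5 * w @ A @ w,
    # a stand-in for the second-order surface in the simulations
    A = np.array([[1.0, 0.0],
                  [0.0, 25.0]])
    return A @ w

def momentum_descent(w0, lr=0.01, beta=0.9, steps=200):
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        # Velocity is an exponentially weighted sum of past gradients:
        # v_t = g_t + beta * g_{t-1} + beta^2 * g_{t-2} + ...
        v = beta * v + grad(w)
        w = w - lr * v          # step along the accumulated velocity
    return w

print(momentum_descent([5.0, 3.0]))  # should approach the minimum at (0, 0)
```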
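And a sketch of the two NAG implementations compared in the video: the look-ahead form evaluates the gradient at w + beta*v, while the rewritten form (a change of variables to phi = w + beta*v) evaluates it at the current iterate. Both reuse grad() from the sketch above, and the hyperparameters are again assumed values:

```python
def nag_lookahead(w0, lr=0.01, beta=0.9, steps=200):
    # Implementation 1: gradient taken at the look-ahead point w + beta*v
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v - lr * grad(w + beta * v)
        w = w + v
    return w

def nag_rewritten(w0, lr=0.01, beta=0.9, steps=200):
    # Implementation 2: the same method after the change of variables
    # phi = w + beta*v, so the gradient is taken at the current iterate
    phi = np.array(w0, dtype=float)
    v = np.zeros_like(phi)
    for _ in range(steps):
        g = grad(phi)
        v = beta * v - lr * g
        phi = phi + beta * v - lr * g
    return phi

# Both variants converge to (nearly) the same point, since phi and w
# differ only by beta*v, which vanishes as the iterates settle
print(nag_lookahead([5.0, 3.0]), nag_rewritten([5.0, 3.0]))
```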
All images and animations in this video belong to me
References
On the Importance of Initialization and Momentum in Deep Learning
Ilya Sutskever, James Martens, George Dahl, Geoffrey Hinton
https://www.cs.toronto.edu/~fritz/absps/momentum.pdf
Advances in Optimizing Recurrent Networks
Yoshua Bengio, Nicolas Boulanger-Lewandowski, Razvan Pascanu
https://arxiv.org/abs/1212.0901
#machinelearning #computervision
#gradientdescent #deeplearning #ai
#optimization #education
#artificialintelligence #aitutorial
#convolutionalneuralnetwork #neuralnetwork
#convolutionalneuralnetworks #neuralnetworks
#imageprocessing #datascience
#computervisionwithhuseyinozdemir