
Nesterov Accelerated Gradient from Scratch in Python

Momentum is great, but it would be even better if the gradient descent steps could slow down as they reach the bottom of a minimum. That is Nesterov Accelerated Gradient in a nutshell, check it out!

Code can be found here: https://github.com/yacineMahdid/artificial-intelligence-and-machine-learning

## Credit

Check out this blogpost for more explanation of gradient descent: https://ruder.io/optimizing-gradient-descent/index.html#nesterovacceleratedgradient

The music is taken from YouTube music!

## Table of Contents

- Introduction:
- Theory:
- Python Implementation:
- Conclusion:

Here is an explanation of Nesterov Accelerated Gradient from that very cool blogpost mentioned in the credit section (check it out!):

"Nesterov accelerated gradient (NAG) [see reference] is a way to give our momentum term this kind of prescience. We know that we will use our momentum term γv_{t-1} to move the parameters θ. Computing θ - γv_{t-1} thus gives us an approximation of the next position of the parameters (the gradient is missing for the full update), a rough idea where our parameters are going to be. We can now effectively look ahead by calculating the gradient not w.r.t. our current parameters θ but w.r.t. the approximate future position of our parameters."

A short Python sketch of this lookahead update is included at the end of this description.

## Reference

Nesterov, Y. (1983). A method for unconstrained convex minimization problem with the rate of convergence O(1/k²). Doklady AN SSSR (translated as Soviet Math. Dokl.), vol. 269, pp. 543-547.

----

Join the Discord for general discussion: https://discord.gg/QpkxRbQBpf

----

Follow Me Online Here:

- Twitter: https://twitter.com/CodeThisCodeTh1
- GitHub: https://github.com/yacineMahdid
- LinkedIn: https://www.linkedin.com/in/yacine-mahdid-809425163/
- Instagram: https://www.instagram.com/yacine_mahdid/

___

Have a great week! 👋
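## Python Sketch

Here is a minimal NumPy sketch of the lookahead update quoted above (v_t = γ·v_{t-1} + η·∇J(θ - γ·v_{t-1}), then θ = θ - v_t). It is an illustration under assumed defaults; the function name, learning rate, and momentum values are my own choices and not necessarily the exact code in the repository linked above.

```python
import numpy as np

def nesterov_accelerated_gradient(gradient, theta0, learning_rate=0.01,
                                  momentum=0.9, n_iterations=100):
    """Minimize a function with Nesterov Accelerated Gradient.

    gradient : callable returning dJ/dtheta at a given point
    theta0   : initial parameter vector
    """
    theta = np.asarray(theta0, dtype=float)
    velocity = np.zeros_like(theta)

    for _ in range(n_iterations):
        # Look ahead: evaluate the gradient at the approximate future
        # position (theta - momentum * velocity) instead of at theta.
        lookahead = theta - momentum * velocity
        velocity = momentum * velocity + learning_rate * gradient(lookahead)
        theta = theta - velocity

    return theta

# Example: minimize J(theta) = theta^2, whose gradient is 2 * theta.
print(nesterov_accelerated_gradient(lambda theta: 2 * theta, theta0=[5.0]))
```

The only difference from plain momentum is where the gradient is evaluated: momentum uses the current θ, while NAG uses the lookahead point θ - γ·v_{t-1}, which lets the update start slowing down before it overshoots the bottom of the minimum.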
