Reinforcement Learning has helped train neural networks to win games and drive cars, and it even helps ChatGPT sound more human when it responds to your prompt. This StatQuest covers the essential concepts of how this process works. BAM!
If you'd like to support StatQuest, please consider...
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...buying a book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/
...or just donating to StatQuest!
PayPal: https://www.paypal.me/statquest
Venmo: @JoshStarmer
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer
0:00 Awesome song and introduction
4:01 Backpropagation review
6:25 The problem with standard backpropagation
7:04 Taking a guess to calculate the derivative
11:20 Using a reward to update the derivative
14:56 Alternative rewards
16:01 Updating a parameter with the updated derivative
16:56 A second example
22:05 Summary
#StatQuest