The attention mechanism is best known for its use in Transformers. But where does it come from? Its origins lie in fixing a strange problem with RNNs.
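For the curious, here's a minimal NumPy sketch of the core idea covered in the video: the decoder scores each encoder hidden state against its own state and takes a weighted average as the context vector. The names and dimensions are illustrative, not taken from any specific model in the video.

import numpy as np

def attention(decoder_state, encoder_states):
    # Dot-product attention: score every encoder state against the decoder state
    scores = encoder_states @ decoder_state       # shape: (seq_len,)
    weights = np.exp(scores - scores.max())       # numerically stable softmax
    weights /= weights.sum()
    context = weights @ encoder_states            # weighted average, shape: (hidden_dim,)
    return context, weights

# Toy example: 5 encoder steps, hidden size 4
enc = np.random.randn(5, 4)
dec = np.random.randn(4)
context, weights = attention(dec, enc)
print(weights)  # attention distribution over the source sequence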
Support me on Patreon! https://patreon.com/vcubingx
Language Modeling Playlist: https://youtube.com/playlist?list=PLyPKqVSnetmELS_I3FRfXZRKAxV5HB9fc&si=IRBpmEtun0xX7X9_
3blue1brown series on Transformers: https://youtu.be/wjZofJX0v4M
The source code for the animations can be found here:
https://github.com/vivek3141/dl-visualization
The animations in this video were made using 3blue1brown's library, manim:
https://github.com/3b1b/manim
Sources (includes the entire series): https://docs.google.com/document/d/1e9dB6Q-A6z1d33w2m0_DrMS5EdbSunY2TPvZmZf41Gc/edit
Chapters
0:00 Introduction
0:22 Machine Translation
2:01 Attention Mechanism
8:04 Outro
Music (In Order):
Helynt - Route 10
Helynt - Bo-Omb Battlefield
Helynt - Underwater
Philanthrope, mommy - embrace https://chll.to/7e941f72
Helynt - Twinleaf Town
Follow me!
Website: https://vcubingx.com
Twitter: https://twitter.com/vcubingx
Github: https://github.com/vivek3141
Instagram: https://instagram.com/vcubingx
Patreon: https://patreon.com/vcubingx