Self Attention in Transformers | Transformers in Deep Learning

Learn With Jay · 8,122 views · 6 months ago

We dive deep into the concept of Self Attention in Transformers! Self attention is a key mechanism that allows models like BERT and GPT to capture long-range dependencies within text, making them powerful for NLP tasks. We'll break down how self attention in Transformers works, looking at the math of how it generates a new word representation from embeddings. Whether you're new to Transformers or looking to strengthen your understanding, this video provides a clear and accessible explanation of Self Attention in Transformers with visuals and complete mathematics.

➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Timestamps:
0:00 Intro
1:13 The Problem
4:00 Self Attention Overview
6:04 Self Attention Mathematics - Part 1
19:20 Self Attention as Gravity
20:07 Problems with the equation
26:51 Self Attention Complete
31:18 Benefits of Self Attention
34:30 Recap of Self Attention
38:53 Self Attention in the form of matrix multiplication
42:39 Outro
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Follow my entire Transformers playlist:
📕 Transformers Playlist: https://www.youtube.com/watch?v=lRylkiFdUdk&list=PLuhqtP7jdD8CQTxwVsuiFYGvHtFpNhlR3&pp=iAQB
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
✔ RNN Playlist: https://www.youtube.com/watch?v=lWPkNkShNbo&list=PLuhqtP7jdD8ARBnzj8SZwNFhwWT89fAFr
✔ CNN Playlist: https://www.youtube.com/watch?v=E5Z7FQp7AQQ&list=PLuhqtP7jdD8CD6rOWy20INGM44kULvrHu&t=0s
✔ Complete Neural Network: https://www.youtube.com/watch?v=mlk0rddP3L4&list=PLuhqtP7jdD8CftMk831qdE8BlIteSaNzD&t=0s
✔ Complete Logistic Regression Playlist: https://www.youtube.com/watch?v=U1omz0B9FTw&list=PLuhqtP7jdD8Chy7QIo5U0zzKP8-emLdny&t=0s
✔ Complete Linear Regression Playlist: https://www.youtube.com/watch?v=nwD5U2WxTdk&list=PLuhqtP7jdD8AFocJuxC6_Zz0HepAWL9cF&t=0s
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
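The mechanism the video walks through (projecting embeddings into queries, keys, and values, scoring every pair of words, and forming each word's new representation as a weighted sum) can be sketched in a few lines of NumPy. This is a minimal illustration of standard scaled dot-product self-attention, not code from the video; the variable names and dimensions are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project each word embedding into a query, key, and value vector.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    # Dot-product scores between every pair of words, scaled by sqrt(d_k)
    # (the scaling addresses the "problems with the equation" the video covers).
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # New representation of each word: attention-weighted sum of all values.
    return weights @ V

# Toy example: a 4-word "sentence" with 8-dimensional embeddings (assumed sizes).
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq = rng.normal(size=(d_model, d_k))
Wk = rng.normal(size=(d_model, d_k))
Wv = rng.normal(size=(d_model, d_k))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # one new d_k-dimensional representation per word
```

Note that the whole sequence is processed in one batch of matrix multiplications (the 38:53 section of the video), which is what makes self-attention easy to parallelize compared with recurrent models.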