Transformer Neural Networks Derived from Scratch

Algorithmic Simplicity 171,999 2 years ago

#transformers #chatgpt #SoME3 #deeplearning

Join me on a deep dive to understand the most successful neural network ever invented: the transformer. Transformers, originally invented for natural language translation, are now everywhere. They have rapidly taken over the world of machine learning (and the world more generally) and are now used for almost every application, not the least of which is ChatGPT.

In this video I take a more constructive approach to explaining the transformer: starting from a simple convolutional neural network, I step through all of the changes that need to be made, along with the motivation for each one.

*By "from scratch" I mean "from a comprehensive mastery of the intricacies of convolutional neural network training dynamics". Here is a refresher on CNNs: https://www.youtube.com/watch?v=8iIdWHjleIs

Chapters:
00:00 Intro
01:13 CNNs for text
05:28 Pairwise Convolutions
07:54 Self-Attention
13:39 Optimizations
