
How do Transformer Models keep track of the order of words? Positional Encoding

Serrano.Academy 6,171 5 months ago

Transformer models can generate language remarkably well, but how do they do it? A crucial step in the pipeline is positional encoding, which injects the order of the words into the LLM so that it can tell apart sentences that use the same words in a different order.

This is part of a series of videos on Large Language Models:
- Video 1: The attention mechanism: https://www.youtube.com/watch?v=OxCpWwDCDFQ
- Video 2: The math behind the attention mechanism: https://www.youtube.com/watch?v=UPtG_38Oq8o
- Video 3: Transformer models: https://www.youtube.com/watch?v=qaWMOYf4ri8

Grokking Machine Learning Book: https://www.manning.com/books/grokking-machine-learning
40% discount promo code: serranoyt

Timestamps:
00:00 Introduction
03:01 How much to add to each word?
03:29 Some trigonometry
06:20 The formulas
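For readers who want to see the idea concretely, below is a minimal sketch of the standard sinusoidal positional encoding from "Attention Is All You Need" (the original Transformer paper), which is the usual formulation behind "The formulas" discussed in videos like this one. The function name `sinusoidal_positional_encoding` and the parameter choices are illustrative, not taken from the video.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encoding (Vaswani et al., 2017).

    Returns an array of shape (seq_len, d_model); row p is the vector
    added to the embedding of the word at position p.
    """
    positions = np.arange(seq_len)[:, np.newaxis]   # shape (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]        # shape (1, d_model)
    # Each pair of dimensions (2i, 2i+1) shares one frequency: 1 / 10000^(2i/d_model).
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])   # odd dimensions use cosine
    return pe

# Because a different vector is added at each position, the same word at
# position 0 and position 5 produces different inputs to the model, which is
# how "dog bites man" becomes distinguishable from "man bites dog".
pe = sinusoidal_positional_encoding(seq_len=10, d_model=16)
print(pe.shape)  # (10, 16)
```

The sine/cosine pairing gives each position a unique, smoothly varying signature whose low-index dimensions oscillate quickly and whose high-index dimensions oscillate slowly, so nearby positions get similar encodings while distant ones do not.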
