With the ongoing buzz around DeepSeek-R1 and other reasoning models, many people are looking for a quick but digestible explanation of how these models are trained.
We give a simple explanation of the building blocks of a Large Language Model (LLM), then describe the three phases of training that turn an LLM from a generator of gibberish into a producer of cogent responses.
Best heard at 1.25x or 1.5x speed.
Follow-on video from @semilearned, “Why is DeepSeek R1 so Fast and Cheap?”:
https://www.youtube.com/watch?v=Z734WC8oeGM
Chapters:
00:00 Introduction
00:40 From Randomness to Order
01:44 Supervised Learning and Fine Tuning
04:10 The Magic of Reinforcement Learning
07:35 Summary of How LLMs Get Trained