I follow the journey that led to the explosion of Large Language Models. From Jordan's pioneering work in 1986 to today's GPT-4, this documentary traces how AI learned to talk. It features insights from pioneers including Chomsky, Hofstadter, Hinton, and LeCun, and explores the revolutionary concepts that made ChatGPT possible: the transformer architecture, attention mechanisms, next-token prediction, and emergent capabilities. Next video: OpenAI's o1 model.
My script, references & visualizations here: https://docs.google.com/document/d/1s7FNPoKPW9y3EhvzNgexJaEG2pP4Fx_rmI4askoKZPA
Consider joining my channel as a YouTube member: https://www.youtube.com/channel/UCotwjyJnb-4KW7bmsOoLfkg/join
This is part of "The Pattern Machine" series: https://www.youtube.com/watch?v=YulgDAaHBKw&list=PLbg3ZX2pWlgKV8K6bFJr5dhM7oOClExUJ
TIMESTAMPS:
00:00 - Introduction
00:32 - Hofstadter's thoughts on ChatGPT
01:00 - recap of supervised learning
01:55 - first paper on sequential learning
02:55 - first use of state units (RNN)
04:33 - first observation of word boundary detection
05:30 - first observation of word clustering
07:16 - first "large" language model (Hinton/Sutskever)
10:10 - sentiment neuron (Ilya Sutskever | OpenAI)
12:30 - transformer explanation
15:50 - GPT-1
17:00 - GPT-2
17:55 - GPT-3
18:20 - In-context learning
19:40 - ChatGPT / GPT-4
21:10 - tool use
23:25 - philosophical question: what is thought?