A full history of reinforcement learning's development, from Michie's matchbox computer to modern robotic systems. Traces the evolution of key concepts through games and physical control problems, showing how simulation-trained skills transfer to reality through domain randomization. Explores the emergence of human-like behaviors in AI agents and raises profound questions about the relationship between actions and language. Examines cutting-edge developments in embodied AI, from humanoid robots (Tesla's Optimus, Figure, Atlas) to OpenAI's dexterous manipulation, and considers the future of action prediction models inspired by large language models. A thought-provoking exploration of how robots develop physical intelligence and what this means for the future of AI.
Thanks to Jane Street for sponsoring this video. They are hiring people interested in ML! To learn more about their work and open roles (and support me), visit their website: jane-st.co/ml
Featuring insights from:
Claude Shannon
Arthur Samuel
Gerald Tesauro
Richard Sutton
David Silver
DeepMind, OpenAI, etc.
00:00 - Introduction
00:32 - Learning Tic Tac Toe
02:00 - Learning Cart and Pole
04:20 - Shannon & Chess
06:50 - Samuel's Checkers
09:25 - TD-Gammon (Gerald Tesauro)
11:00 - TD Learning
14:30 - Learning Atari (DQN)
17:28 - Direct Policy Gradient
19:40 - Domain Randomization