In this video, we build a Generalist Robotics Policy (GRP) from scratch. Generalist Robotics Policies are large models for robotics that are trained on large amounts of interaction data. We reimplement the "Octo: An Open-Source Generalist Robot Policy" model step by step, starting from a few lines of transformer code, and show how to train the model on data from the Open X-Embodiment dataset.

The code used in the video can be found here: https://github.com/milarobotlearningcourse/mini-grp
Webpage describing the project: https://fracturedplane.notion.site/Coding-Generalist-Robotics-Policies-11921485727680c189a5c09e8b64ea14?pvs=4
Twitter: https://x.com/GlenBerseth
My website: http://www.fracturedplane.com/

If you are not familiar with transformers, I suggest you watch Karpathy's tutorial first: https://www.youtube.com/watch?v=kCc8FmEb1nY

Chapters:
00:00:00 Intro: ChatGPT, Language Models and the Goals of Generalist Robotics Policies
00:08:54 Reading and exploring the data
00:16:10 Creating a Dataset
00:24:10 Creating a Dataset
00:25:40 Creating the transformer encoder
00:30:40 Creating image patches to tokenize
00:34:40 Putting together the ViT
00:38:40 Training the ViT
00:41:15 Making the GRP, starting with adding text inputs
00:47:40 Modifying the data for training
00:49:16 Converting continuous actions to discrete bins
00:52:21 Standardizing the state inputs
00:57:52 Changing to use continuous actions
01:03:45 Standardizing the action space
01:08:12 Adding goal images to the transformer
01:14:25 Adding block masked attention to use either the text or image goal
01:19:00 Scaling training
01:20:08 Training results across A100s
01:22:00 Evaluation using the SimplerEnv robotics simulator