I’m far more optimistic about the state of open recipes for, and knowledge of, post-training entering 2025 than I was entering 2024. Last year, one of my first posts argued that open post-training won’t match the likes of GPT-4. That is still the case, but we now at least better understand the scope of what we will be working with.
It’s a good time to record an overview of what post-training looks like today. I gave the first version of this talk in 2023, when it felt like a review of the InstructGPT paper rather than a survey of reproduced results from the literature. In 2024, the scientific community made substantial progress in actually training these models and expanding the frontier of knowledge. Giving one of these talks every year feels like a good way to keep tabs on the state of play (whereas last year, I just had a bunch of links to add to the conversation on where to start).
00:00 Introduction
10:00 Prompts & Skill Selection
14:19 Instruction Finetuning
21:45 Preference Finetuning
36:17 Reinforcement Finetuning
45:28 Open Questions
52:02 Wrap Up
Slides: https://docs.google.com/presentation/d/1FL6pzRT3tjCfJ985emS_2YfujCe_iz6dsyRcDIUFPqs/edit#slide=id.g31d874a0784_2_0
More context: https://www.interconnects.ai/p/the-state-of-post-training-2025
Get Interconnects (https://www.interconnects.ai/)...
... on YouTube: https://www.youtube.com/@interconnects
... on Twitter: https://x.com/interconnectsai
... on Linkedin: https://www.linkedin.com/company/interconnects-ai
... on Spotify: https://open.spotify.com/show/2UE6s7wZC4kiXYOnWRuxGv
... on Apple Podcasts: https://podcasts.apple.com/us/podcast/interconnects/id1719552353