In this video, I explain how language models generate text, why most of the process is actually deterministic (not random), and how you can shape the probability distribution an LLM samples its next token from using parameters like temperature and top-p.
I cover temperature in depth and use a spreadsheet to demonstrate how different values change the probabilities.
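If you'd like to see what that sampling step looks like in code, here's a rough NumPy sketch (my own illustrative names, not code from the video or from any particular LLM library):

import numpy as np

def sample_next_token(logits, temperature=1.0, top_p=1.0, rng=None):
    # Temperature divides the logits before softmax:
    # values < 1 sharpen the distribution, values > 1 flatten it.
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    # Top-p keeps the smallest set of highest-probability tokens whose
    # cumulative probability reaches top_p, then renormalizes over them.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]
    return rng.choice(keep, p=probs[keep] / probs[keep].sum())

# Toy 4-token vocabulary with made-up logits
logits = [2.0, 1.0, 0.5, -1.0]
print(sample_next_token(logits, temperature=0.7, top_p=0.9))

With these toy numbers, temperature 0.7 pushes the top token to about 73% probability, and top-p 0.9 cuts the candidate pool down to the top two tokens before sampling.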
Topics:
00:10 Tokens & Why They Matter
03:27 Special Tokens
04:35 The Inference Loop
07:26 Random or Not?
08:11 Deep Dive into Temperature
14:19 Tips for Setting Temperature
16:11 Top P
If you'd like to play with the temperature calculator spreadsheet, you can make a copy of it here (read-only):
https://docs.google.com/spreadsheets/d/17STrAYE5cgKwdrmXySrTtyHwfQe1tLthOqzkVdiPVAk/edit?usp=sharing
To learn more about Entry Point AI, visit our website at https://www.entrypointai.com
Like this video? Hit that subscribe button ⭐️
P.S. PyTorch, TensorFlow, and the underlying GPU libraries can introduce randomness that is tricky to pin down. These are implementation details that will change and presumably get easier to control over time, and they don't change the fundamental nature of LLMs.
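If you're running open models locally with PyTorch and want repeatable outputs, something along these lines helps (a hedged sketch; full determinism still depends on your GPU, drivers, and library versions):

import torch

# Seed PyTorch's random number generators (CPU and all CUDA devices)
# so sampling draws are repeatable across runs
torch.manual_seed(42)

# Prefer deterministic kernels; warn_only=True logs a warning instead
# of raising an error when an op has no deterministic implementation
torch.use_deterministic_algorithms(True, warn_only=True)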