Dr. Soper discusses reinforcement learning in the context of Thompson Sampling and the famous multi-armed bandit problem. Topics include what the multi-armed bandit problem is and why it is important, what Thompson Sampling is and how it works, and the role of the beta distribution in Thompson Sampling.
Previous lesson (Foundations of Reinforcement Learning): https://youtu.be/wVXXLLT6srY
Next lesson (Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 02): https://youtu.be/lDtVuJKLykQ