Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 01

Dr. Daniel Soper 19,429 views 5 years ago


Dr. Soper discusses reinforcement learning in the context of Thompson Sampling and the famous Multi-Armed Bandit Problem. Topics include what the multi-armed bandit problem is and why it is important, what Thompson Sampling is and how it works, and the role of the beta distribution in Thompson Sampling.
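For readers who want a concrete picture before watching, the following is a minimal Python sketch of Thompson Sampling on a Bernoulli multi-armed bandit. It is not taken from the video. The function name `thompson_sampling`, the parameter names, the uniform Beta(1, 1) prior, and the example payout rates are all assumptions chosen for illustration. Each arm keeps a count of wins and losses; on every round the code draws one sample per arm from the beta distribution Beta(wins + 1, losses + 1) and pulls the arm whose sample is largest.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Thompson Sampling on a Bernoulli bandit (illustrative sketch).

    true_rates: hypothetical payout probability of each arm, unknown to the agent.
    rounds: number of pulls to simulate.
    Returns (wins, losses), the per-arm counts observed by the agent.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    wins = [0] * n_arms
    losses = [0] * n_arms

    for _ in range(rounds):
        # Draw one sample per arm from its beta posterior.
        # The +1 terms correspond to an assumed uniform Beta(1, 1) prior.
        samples = [
            rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n_arms)
        ]

        # Pull the arm with the largest sampled value.
        arm = max(range(n_arms), key=lambda i: samples[i])

        # Observe a simulated reward and update that arm's counts.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1

    return wins, losses


if __name__ == "__main__":
    wins, losses = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
    for i, (w, l) in enumerate(zip(wins, losses)):
        print(f"arm {i}: pulls={w + l}, wins={w}")
```

In a run like this, most pulls tend to concentrate on the arm with the highest payout rate as its beta posterior narrows, while the weaker arms are still tried occasionally early on.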

Previous lesson (Foundations of Reinforcement Learning): https://youtu.be/wVXXLLT6srY
Next lesson (Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 02): https://youtu.be/lDtVuJKLykQ

reinforcement learning

thompson sampling

multi-armed bandit

beta distribution

artificial intelligence
