DeepMind x UCL RL Lecture Series - Policy-Gradient and Actor-Critic methods [9/13]

Google DeepMind | 40,427 views | 4 years ago

Research Scientist Hado van Hasselt covers policy-gradient algorithms, which learn policies directly, and actor-critic algorithms, which combine them with value predictions for more efficient learning. Slides: https://dpmd.ai/policygradient Full video lecture series: https://dpmd.ai/DeepMindxUCL21
