In this video, I discuss the paper "Asymmetric self-play for automatic goal discovery in robotic manipulation," which describes an effective reinforcement learning method for training robots in simulation using self-play (robot vs. robot). One robot's goal is to rearrange the objects in a way it thinks the other robot won't be able to reproduce; the resulting end state then becomes the second robot's goal, and that robot tries to rearrange the objects to match it.
My 3 favorite things about this method are:
1. You get an auto-generated curriculum for the robot to learn from.
2. You know that each task is solvable since a robot created it.
3. You automatically get a demonstration of each task, which the solver robot can learn from.
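To make the loop concrete, here's a toy sketch of the asymmetric self-play idea in Python. Everything here (the 1-D "reach" environment, the skill variables, the update sizes) is a made-up illustration, not the paper's actual algorithm: the proposer ("Alice") creates a goal at the edge of her ability, the solver ("Bob") tries to match it, and whichever robot fails gets pushed to improve, giving the automatic curriculum described above.

```python
import random

def alice_propose(alice_skill):
    """Alice manipulates the block and proposes the end state as a goal.
    Her reach grows with her skill, so goals get harder over time."""
    return random.uniform(-alice_skill, alice_skill)

def bob_attempt(goal, bob_skill):
    """Bob tries to match the goal; in this toy model he succeeds
    whenever the goal is within his reach."""
    return abs(goal) <= bob_skill

def self_play(rounds=1000, seed=0):
    random.seed(seed)
    alice_skill, bob_skill = 1.0, 1.0
    for _ in range(rounds):
        goal = alice_propose(alice_skill)
        if bob_attempt(goal, bob_skill):
            # Bob solved it, so Alice is rewarded for finding
            # harder-to-reproduce states next time.
            alice_skill += 0.01
        else:
            # Bob failed, so he learns from Alice's demonstration
            # (the paper uses behavioral cloning here) and improves.
            bob_skill += 0.02
    return alice_skill, bob_skill

alice_skill, bob_skill = self_play()
print(alice_skill, bob_skill)
```

Because every goal is a state Alice actually reached, each task is solvable by construction, and Alice's trajectory doubles as a free demonstration, which is exactly points 2 and 3 above.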
📄 The Paper:
https://arxiv.org/abs/2101.04882
Authors (OpenAI):
Matthias Plappert, Raul Sampedro, Tao Xu, Ilge Akkaya, Vineet Kosaraju, Peter Welinder, Ruben D'Sa, Arthur Petron, Henrique Ponde de Oliveira Pinto, Alex Paino, Hyeonwoo Noh, Lilian Weng, Qiming Yuan, Casey Chu, Wojciech Zaremba
Connect with me:
🐦 Twitter - https://twitter.com/elliotwaite
💬 Discord - https://discord.gg/cdQhRgw
📷 Instagram - https://www.instagram.com/elliotwaite
💼 LinkedIn - https://www.linkedin.com/in/elliotwaite