Discover distribution strategies and related concepts for data- and model-parallel training. Walk through an example of training a 39-billion-parameter language model on TPUs, and conclude with the challenges and best practices of orchestrating large-scale language model training.
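As a flavor of the distribution strategies the session covers, here is a minimal sketch of data-parallel training on TPUs with TensorFlow's TPUStrategy. The TPU address, toy model, and hyperparameters are placeholder assumptions for illustration, not the session's 39-billion-parameter setup.

```python
import tensorflow as tf

# Resolve and initialize the TPU system. tpu="" assumes a Cloud TPU VM;
# a named TPU or gRPC address would be passed here instead (assumption).
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

# TPUStrategy replicates the model across TPU cores (data parallelism).
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    # Toy placeholder model, far smaller than the language model
    # discussed in the session.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

# A subsequent model.fit(dataset) runs each training step replicated
# across TPU cores, with gradients all-reduced automatically.
```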
Resource:
TensorFlow website → https://goo.gle/3KejoUZ
Speakers: Nikita Namjoshi, Vaibhav Singh
Watch more:
All Google I/O 2022 Sessions → https://goo.gle/IO22_AllSessions
ML/AI at I/O 2022 playlist → https://goo.gle/IO22_ML-AI
All Google I/O 2022 technical sessions → https://goo.gle/IO22_Sessions
Subscribe to TensorFlow → https://goo.gle/TensorFlow
#GoogleIO