Scaling Large Models with Model & Data Parallelism: Techniques, Tradeoffs, and Best Practices

All Things Open 89 4 weeks ago

Presented at All Things Open AI 2025
Presented by Shashank Kapadia - Walmart
Title: Scaling Large Models with Model & Data Parallelism: Techniques, Tradeoffs, and Best Practices

Abstract: Discover how to train and serve massive AI models efficiently by leveraging both model and data parallelism. In this session, we’ll explore how to partition large models across GPUs and distribute data for optimal throughput, diving deep into practical setup details and performance benchmarks. We’ll also address the key tradeoffs (such as latency vs. resource usage) and show how to tailor parallelization strategies to different AI tasks, going beyond transformers into computer vision and more. By the end, you’ll have a holistic understanding of how to design and deploy parallelized workflows that balance accuracy, speed, and infrastructure costs, enabling you to scale AI solutions effectively in real-world scenarios.

Find more info about All Things Open:
On the web: https://www.allthingsopen.org/
Twitter: https://twitter.com/AllThingsOpen
LinkedIn: https://www.linkedin.com/company/all-things-open/
Instagram: https://www.instagram.com/allthingsopen/
Facebook: https://www.facebook.com/AllThingsOpen
Mastodon: https://mastodon.social/@allthingsopen
Threads: https://www.threads.net/@allthingsopen
Bluesky: https://bsky.app/profile/allthingsopen.bsky.social
2025 conference: https://2025.allthingsopen.org/
