Diffusion models explained. How does OpenAI's GLIDE work?

AI Coffee Break with Letitia · 95,041 views · 3 years ago


Diffusion models beat GANs in image synthesis, and GLIDE generates images from text descriptions, surpassing even DALL-E in terms of photorealism! Check out this video to learn how diffusion models work. Enjoy the visuals!

SPONSOR: Weights & Biases 👉 https://wandb.me/ai-coffee-break

❓ Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

Recommended videos:
📺 DALL-E video: https://youtu.be/mvG2FGF0TvM
📺 GAN explained video: https://youtu.be/_qB4B6ttXk8
📺 CLIP video: https://youtu.be/dh8Rxhf7cLU

Papers:
📜 GLIDE paper: Nichol, Alex, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. "Glide: Towards photorealistic image generation and editing with text-guided diffusion models." arXiv preprint arXiv:2112.10741 (2021). https://arxiv.org/abs/2112.10741
🔗 GLIDE mini, demo: https://huggingface.co/spaces/valhalla/glide-text2im
📜 Diffusion models for image generation: Dhariwal, Prafulla, and Alexander Nichol. "Diffusion models beat GANs on image synthesis." Advances in Neural Information Processing Systems 34 (2021). https://arxiv.org/abs/2105.05233
📜 Original diffusion models paper: Sohl-Dickstein, Jascha, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. "Deep unsupervised learning using nonequilibrium thermodynamics." In International Conference on Machine Learning, pp. 2256-2265. PMLR, 2015. https://arxiv.org/abs/1503.03585
🔗 Check out this awesome blog post by Lilian Weng: https://lilianweng.github.io/lil-log/2021/07/11/diffusion-models.html
🔗 Flow-based models: https://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html
🔗 DALL-E blog post: https://openai.com/blog/dall-e/
💻 If you are interested in the basic code of diffusion models, here is a wonderful annotated diffusion model from 🤗: https://huggingface.co/blog/annotated-diffusion

Outline:
00:00 Diffusion models are cool
00:33 Weights & Biases (Sponsor)
01:51 4 types of generative models (in 2022)
05:13 Diffusion models explained
08:27 Why are diffusion models good at photorealism? Diffusion models beat GANs
10:36 GLIDE explained
12:16 Classifier-guided diffusion, CLIP-guided diffusion
13:56 Classifier-free guidance

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏 Don Rosenthal, Dres. Trost GbR, banana.dev -- Kyle Morris, Joel Ang, Julián Salazar, Edvard Grødem

🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video contains the rock emoji designed by OpenMoji, the open-source emoji and icon project. License: CC BY-SA 4.0
Music 🎵 : Tell Me That I Can't (Instrumental) by NEFFEX
