

Diffusion models explained. How does OpenAI's GLIDE work?


Diffusion models beat GANs in image synthesis, and GLIDE generates images from text descriptions, surpassing even DALL-E in terms of photorealism! Check out this video to learn how diffusion models work. Enjoy the visuals!

SPONSOR: Weights & Biases 👉 https://wandb.me/ai-coffee-break

❓ Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

Recommended videos:
📺 DALL-E video: https://youtu.be/mvG2FGF0TvM
📺 GAN explained video: https://youtu.be/_qB4B6ttXk8
📺 CLIP video: https://youtu.be/dh8Rxhf7cLU

Papers:
📜 GLIDE paper: Nichol, Alex, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. "Glide: Towards photorealistic image generation and editing with text-guided diffusion models." arXiv preprint arXiv:2112.10741 (2021). https://arxiv.org/abs/2112.10741
🔗 GLIDE mini, demo: https://huggingface.co/spaces/valhalla/glide-text2im
📜 Diffusion models for image generation: Dhariwal, Prafulla, and Alexander Nichol. "Diffusion models beat GANs on image synthesis." Advances in Neural Information Processing Systems 34 (2021). https://arxiv.org/abs/2105.05233
📜 Original diffusion models paper: Sohl-Dickstein, Jascha, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. "Deep unsupervised learning using nonequilibrium thermodynamics." In International Conference on Machine Learning, pp. 2256-2265. PMLR, 2015. https://arxiv.org/abs/1503.03585
🔗 Check out this awesome blog post by Lilian Weng: https://lilianweng.github.io/lil-log/2021/07/11/diffusion-models.html
🔗 Flow-based models: https://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html
🔗 DALL-E blog post: https://openai.com/blog/dall-e/
💻 If you are interested in the basic code of diffusion models, here is a wonderful annotated diffusion model from 🤗: https://huggingface.co/blog/annotated-diffusion

Outline:
00:00 Diffusion models are cool
00:33 Weights & Biases (Sponsor)
01:51 4 types of generative models (in 2022)
05:13 Diffusion models explained
08:27 Why are diffusion models good at photorealism? – Diffusion models beat GANs
10:36 GLIDE explained
12:16 Classifier-guided diffusion, CLIP-guided diffusion
13:56 Classifier-free guidance

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Don Rosenthal, Dres. Trost GbR, banana.dev -- Kyle Morris, Joel Ang, Julián Salazar, Edvard Grødem

🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video contains the rock emoji designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0
Music 🎵: Tell Me That I Can't (Instrumental) by NEFFEX
