Swin Transformer paper explained, visualized, and animated by Ms. Coffee Bean. Find out what the Swin Transformer proposes to do better than the Vision Transformer (ViT).
📺 ViT explained: https://youtu.be/DVoHvmww2lQ
📺 Transformer explained: https://youtu.be/FWFA4DGuzSc
📺 Positional embeddings (playlist): https://youtube.com/playlist?list=PLpZBeKTZRGPOQtbCIES_0hAvwukcs-y-x
──────────────────────────
Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
donor, Dres. Trost GbR, Yannik Schneider
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
──────────────────────────
Paper discussed:
📜 Liu, Ze, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows." arXiv preprint arXiv:2103.14030 (2021). https://arxiv.org/abs/2103.14030
💻 Swin Transformer code on GitHub: https://github.com/microsoft/Swin-Transformer
Outline:
00:00 Problems with ViT / Swin Motivation
04:16 Swin Transformer explained
06:00 Shifted window based self-attention
08:58 Positional embeddings in the Swin Transformer
09:29 Task performance of the Swin Transformer
Music 🎵: Bay Street Millionaires by Squadda B
---------------------
🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Video and thumbnail contain emojis designed by OpenMoji, the open-source emoji and icon project. License: CC BY-SA 4.0