A full explanation of the Segment Anything Model (SAM) from Meta, along with its code.
As always, the slides are freely available: https://github.com/hkproj/segment-anything-slides
Chapters
00:00 - Introduction
01:20 - Image Segmentation
03:28 - Segment Anything
06:58 - Task
08:20 - Model (Overview)
09:51 - Image Encoder
10:07 - Vision Transformer
12:30 - Masked Autoencoder Vision Transformer
15:32 - Prompt Encoder
21:15 - Positional Encodings
24:52 - Mask Decoder
35:43 - Intersection Over Union
37:08 - Loss Functions
39:10 - Data Engine and Dataset
41:35 - Non-Maximum Suppression