Caltech CV4E - Intro to Computer Vision - Deep Learning, CNNs, Transformers - Prof. Sara Beery

CV4Ecology 2,468 1 month ago

2025 Computer Vision for Ecology Workshop at Caltech - Lecture 1

MIT Professor Sara Beery introduces the basics of computer vision methods. She explains how a deep learning model is trained to classify images and introduces the most commonly used deep learning architectures: convolutional neural networks (CNNs) and transformers.

This lecture is part of a summer workshop at Caltech that teaches PhDs and postdocs in ecology how to apply computer vision to their own research projects. Over three weeks, the students implement a computer vision algorithm, for example, to count walruses from space, detect invasive rats, or identify which gorilla is beating their chest. These lectures guide them through those three weeks. See https://cv4ecology.caltech.edu for more information.

Edited by Björn Lütjens.

⛆ Contents ⛆
🐙 0:00 - Introduction and perceptron
🐙 2:55 - Machine learning from 10k feet
🐙 11:00 - Gradient descent and how to train
🐙 19:00 - Neural networks
🐙 32:25 - Linear classification, decision boundaries, layers
🐙 40:26 - Big stretch
🐙 40:49 - Training loss
🐙 51:20 - Representation learning and embeddings
🐙 58:22 - Computer vision tasks and architectures
🐙 1:10:16 - architectures | multilayer perceptron
🐙 1:12:05 - architectures | how to decide the architecture
🐙 1:18:00 - convolutional neural networks | intro
🐙 1:28:05 - convolutional neural networks | interpretability
🐙 1:32:30 - convolutional neural networks | encoder-decoder, image-to-image, U-Net, ResNet
🐙 1:45:05 - transformers | intro
🐙 1:46:50 - transformers | tokens
🐙 1:52:27 - transformers | (self-)attention, query-key-value
🐙 2:04:40 - transformers | interpretability
🐙 2:12:10 - transformers | positional encoding
