
#55 Dr. ISHAN MISRA - Self-Supervised Vision Models


Patreon: https://www.patreon.com/mlst

Dr. Ishan Misra is a Research Scientist at Facebook AI Research, where he works on computer vision and machine learning. His main research interest is reducing the need for human supervision, and indeed human knowledge, in visual learning systems. He finished his PhD at the Robotics Institute at Carnegie Mellon and has done stints at Microsoft Research, INRIA and Yale. His bachelor's degree is in computer science, where he achieved the highest GPA in his cohort. Ishan is fast becoming a prolific scientist, already with more than 3,000 citations under his belt and co-authorships with Yann LeCun, the godfather of deep learning.

Today, though, we will be focusing on an exciting cluster of recent papers on unsupervised representation learning for computer vision released by FAIR: DINO (Emerging Properties in Self-Supervised Vision Transformers), Barlow Twins (Self-Supervised Learning via Redundancy Reduction) and PAWS (Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples). All of these papers are hot off the press, officially released only in the last month or so. Many of you will remember PIRL (Self-Supervised Learning of Pretext-Invariant Representations), of which Ishan was the primary author in 2019.

Pod: https://anchor.fm/machinelearningstreettalk/episodes/55-Self-Supervised-Vision-Models-Dr--Ishan-Misra---FAIR-e1355js

Panel: Dr. Yannic Kilcher, Sayak Paul (https://sayak.dev/), Dr. Tim Scarfe

Timestamps:
[00:00:00] Self-supervised learning
[00:04:08] Lineage of SSL methods
[00:06:24] Better representations
[00:07:15] Data augmentation
[00:08:43] Mode collapse
[00:09:30] Ishan intro
[00:12:40] DINO
[00:14:19] PAWS
[00:15:09] Barlow Twins
[00:15:36] Dark matter of intelligence article
[00:16:51] Main show kick-off
[00:19:49] Why Ishan is doing work in self-supervised learning
[00:21:57] We don't know what tasks we want to do
[00:23:58] Should we try to get rid of human knowledge?
[00:26:56] Augmentations are knowledge via the back door
[00:35:17] Conceptual abstraction in vision
[00:38:14] Common sense is the dark matter of intelligence
[00:40:42] Are abstract categories (natural kinds) universal?
[00:42:58] Why do these vision algorithms actually work?
[00:46:16] Universality of representations, "semantics of similarity"
[00:49:41] Images on the internet are not uniformly random
[00:54:19] Quality of representations: semi- vs pure self-supervised
[00:57:42] Scaling laws for self-supervised learning and quality control
[01:00:42] Amazon Turk thought experiment
[01:03:01] Architecture developments in SSL
[01:05:33] Architecture improvements - contrastive / SimCLR
[01:07:08] Architecture improvements - projector heads idea
[01:09:15] Architecture improvements - objective functions
[01:09:48] Mode collapse strategies (contrastive, clustering, prototypes, self-distillation)
[01:15:43] DINO
[01:18:20] How SSL is different in vision vs language
[01:22:05] Dark matter paper and latent predictive models
[01:25:56] Energy-based models
[01:28:24] Any big lessons learned?
[01:30:17] AVID paper (video)
[01:33:36] DepthContrast paper (point clouds)

References:
Shuffle and Learn - https://arxiv.org/abs/1603.08561
DepthContrast - https://arxiv.org/abs/2101.02691
DINO - https://arxiv.org/abs/2104.14294
Barlow Twins - https://arxiv.org/abs/2103.03230
SwAV - https://arxiv.org/abs/2006.09882
PIRL - https://arxiv.org/abs/1912.01991
AVID - https://arxiv.org/abs/2004.12943 (best paper candidate at CVPR'21, just announced over the weekend - http://cvpr2021.thecvf.com/node/290)
Alexei (Alyosha) Efros - http://people.eecs.berkeley.edu/~efros/ and http://www.cs.cmu.edu/~tmalisie/projects/nips09/
Exemplar networks - https://arxiv.org/abs/1406.6909
The Bitter Lesson (Rich Sutton) - http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Machine Teaching: A New Paradigm for Building Machine Learning Systems - https://arxiv.org/abs/1707.06742
POET - https://arxiv.org/pdf/1901.01753.pdf

Music credit: https://soundcloud.com/unseenmusic/sets/ambient-electronic-1
Visual clips credit: https://www.youtube.com/watch?v=7-4GpL41DIE

(Note: MLST is 100% non-commercial, non-monetised.)
