Full paper:
https://arxiv.org/abs/2002.05709
Presenter: Dan Fu
Stanford University, USA
Abstract:
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100× fewer labels.
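
Finding (1) concerns composing augmentations: each image is transformed twice, and the two views form a positive pair. A minimal sketch of such a pipeline, assuming torchvision (the paper's own code is in TensorFlow, and the jitter strengths and blur kernel below are illustrative settings, not taken verbatim from the paper's appendix):

import torch
from torchvision import transforms

# Composition of random crop, color distortion, and Gaussian blur.
simclr_augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.8, 0.8, 0.8, 0.2)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.GaussianBlur(kernel_size=23, sigma=(0.1, 2.0)),
    transforms.ToTensor(),
])

class TwoViews:
    """Apply the same stochastic pipeline twice to get a positive pair."""
    def __init__(self, transform):
        self.transform = transform
    def __call__(self, x):
        return self.transform(x), self.transform(x)

The point of the ablation in the paper is that no single augmentation suffices; it is the composition (in particular crop plus color distortion) that makes the contrastive prediction task hard enough to learn good representations.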
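Finding (2) refers to the projection head g(·), a small MLP between the encoder output h and the space where the contrastive (NT-Xent) loss is computed. A minimal PyTorch sketch under those definitions; the class and function names here are illustrative, not from the paper's codebase, and the temperature is a tunable hyperparameter:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """g(.): two-layer MLP mapping encoder output h to contrastive space z."""
    def __init__(self, dim=2048, proj_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(inplace=True), nn.Linear(dim, proj_dim)
        )
    def forward(self, h):
        return self.net(h)

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of N positive pairs (2N views total)."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # unit-norm, (2N, d)
    sim = z @ z.t() / temperature                       # scaled cosine similarity
    sim.fill_diagonal_(float('-inf'))                   # never contrast a view with itself
    # The positive for view i is its counterpart at index (i + N) mod 2N.
    targets = (torch.arange(2 * n, device=z.device) + n) % (2 * n)
    return F.cross_entropy(sim, targets)

Note that the loss is computed on z, but the paper reports that the layer before the projection head, h, is the better representation for downstream tasks, which is why the linear evaluation in the abstract is run on h.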