Deep neural networks perform exceptionally well on clean images but face significant challenges with corrupted ones. While data augmentation with specific corruptions during training can improve model robustness to those particular distortions, this approach typically degrades performance on both clean images and corruptions not encountered during training. In this talk, we present a novel approach that improves DNN robustness across diverse corruptions while maintaining clean image accuracy. Our key insight is that input perturbations can be effectively simulated through multiplicative perturbations in the weight space.
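As a rough illustration of this insight (our own, not taken verbatim from the abstract): for a single linear layer, an elementwise multiplicative perturbation of the input is algebraically the same as rescaling the columns of the weight matrix,

$$
W\,(x \odot \delta) \;=\; \bigl(W\,\mathrm{diag}(\delta)\bigr)\,x ,
$$

so perturbing the input $x$ multiplicatively by $\delta$ acts as a multiplicative perturbation of the weights $W$.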
Building on this finding, we introduce Data Augmentation via Multiplicative Perturbation (DAMP), a training method that optimizes DNNs under random multiplicative weight perturbations. Comprehensive experiments across multiple image classification datasets (CIFAR-10/100, TinyImageNet, and ImageNet) and architectures (ResNet50, ViT-S/16, ViT-B/16) demonstrate that DAMP enhances model generalization under corruptions while maintaining computational efficiency comparable to standard SGD. Notably, DAMP successfully trains a ViT-S/16 on ImageNet from scratch without extensive data augmentations, achieving a top-1 error of 23.7%, comparable to a ResNet50.
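A minimal sketch of what one such training step might look like, assuming PyTorch and a Gaussian multiplicative perturbation of every parameter; the function name damp_step, the sigma hyperparameter, and the gradient handling are illustrative assumptions, not the authors' released implementation:

```python
# Minimal sketch: one optimizer step under random multiplicative weight perturbations.
# All names (damp_step, sigma) and the Gaussian choice are illustrative assumptions.
import torch

def damp_step(model, loss_fn, x, y, optimizer, sigma=0.1):
    """Perturb weights multiplicatively, backprop, then restore and update."""
    originals, factors = [], []
    with torch.no_grad():
        for p in model.parameters():
            z = 1.0 + sigma * torch.randn_like(p)   # random multiplicative factor per weight
            originals.append(p.detach().clone())
            factors.append(z)
            p.mul_(z)                                # W_tilde = W * z (elementwise)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)                      # loss evaluated at the perturbed weights
    loss.backward()                                  # gradient w.r.t. W_tilde

    with torch.no_grad():
        for p, w0, z in zip(model.parameters(), originals, factors):
            p.copy_(w0)                              # restore the unperturbed weights
            if p.grad is not None:
                p.grad.mul_(z)                       # chain rule: dL/dW = dL/dW_tilde * z

    optimizer.step()                                 # stochastic gradient of E_z[ L(W * z) ]
    return loss.item()
```

The chain-rule scaling of the gradient is what makes this update an unbiased stochastic gradient of the expected loss under the weight perturbations; the exact DAMP recipe and its hyperparameters are described in the paper linked below.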
Read the paper: https://arxiv.org/abs/2406.16540
Trung Trinh is a final-year PhD student in the Probabilistic Machine Learning group at Aalto University, Finland, supervised by Prof. Samuel Kaski. His research focuses on improving neural network robustness under data distribution shifts and enhancing model calibration to increase reliability in production environments. His work has been published in leading AI/ML conferences, including NeurIPS, ICLR, and ICML.