If your data distribution shifts, use self-learning

Published in TMLR, 2022

In this paper, we demonstrate that self-learning techniques like entropy minimization and pseudo-labeling are simple yet effective at increasing test performance under domain shifts. Our results show that self-learning consistently improves performance under distribution shifts, irrespective of the model architecture, the pre-training technique, or the type of distribution shift. At the same time, self-learning is simple to use in practice: it does not require knowledge of or access to the original training data or scheme, is robust to hyperparameter choices, is straightforward to implement, and requires only a few training epochs. This makes self-learning techniques highly attractive for any practitioner who applies machine learning algorithms in the real world. We present state-of-the-art adaptation results on CIFAR10-C (8.5% error), ImageNet-C (22.0% mCE), ImageNet-R (17.4% error) and ImageNet-A (14.8% error), theoretically study the dynamics of self-supervised adaptation methods, and propose a new classification dataset (ImageNet-D) which remains challenging even with adaptation.
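To make the idea concrete, below is a minimal sketch of test-time self-learning with entropy minimization or hard pseudo-labels, assuming a standard pretrained PyTorch image classifier. The function names, the optimizer settings, the BatchNorm-only parameter selection, and the test_loader placeholder are illustrative assumptions, not our exact released implementation.

import torch
import torch.nn.functional as F
import torchvision

def softmax_entropy(logits):
    # Entropy of the softmax distribution, computed per sample.
    return -(logits.softmax(dim=1) * logits.log_softmax(dim=1)).sum(dim=1)

def adapt_on_batch(model, optimizer, x, mode="entropy"):
    # One self-learning update on an unlabeled test batch.
    # mode="entropy": minimize the entropy of the model's own predictions.
    # mode="pseudo_label": take the argmax prediction as a hard label and
    # minimize the cross-entropy against it.
    logits = model(x)
    if mode == "entropy":
        loss = softmax_entropy(logits).mean()
    else:
        pseudo_labels = logits.argmax(dim=1).detach()
        loss = F.cross_entropy(logits, pseudo_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage sketch: adapt only a small parameter subset (here, illustratively,
# the BatchNorm parameters) on unlabeled, shifted test data.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
params = [p for m in model.modules()
          if isinstance(m, torch.nn.BatchNorm2d) for p in m.parameters()]
optimizer = torch.optim.SGD(params, lr=1e-3)
# test_loader is assumed to yield batches from the shifted test distribution.
for x, _ in test_loader:  # ground-truth labels are never used
    adapt_on_batch(model, optimizer, x)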

Access the paper · Access the code

@article{rusak2021if,
  title={If your data distribution shifts, use self-learning},
  author={Rusak, Evgenia and Schneider, Steffen and Pachitariu, George and Eck, Luisa and Gehler, Peter and Bringmann, Oliver and Brendel, Wieland and Bethge, Matthias},
  journal={TMLR},
  year={2022}
}