Multi-Task Generalization and Adaptation between Noisy Digit Datasets: An Empirical Study


Transfer learning for adaptation to new tasks is usually performed by fine-tuning either all model parameters or only the parameters in the final layers. We show that good target performance can also be achieved on typical domain adaptation tasks by adapting only the normalization statistics and affine transformations of feature maps throughout the network. We apply this adaptation scheme to supervised domain adaptation on common digit datasets and study its robustness under noise perturbations. Our results indicate that (1) adapting to noise is harder than the widely used digit benchmarks in domain adaptation, (2) the similarity of the optimal adaptation parameters for different domains is strongly predictive of generalization performance, and (3) generalization performance is highest when training on a rich environment or at high noise levels.
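The adaptation scheme described above can be illustrated with a minimal sketch in plain Python. It is not the implementation used in the study: the single frozen linear map `SHARED_WEIGHTS`, the helper names `shared_features`, `adapt_to_domain`, and `forward`, and the toy data are all assumptions made for illustration. The sketch only re-estimates the per-domain normalization statistics and leaves the affine parameters at their identity initialization; in a supervised setting these would presumably be fitted by gradient descent on labeled target data.

```python
from statistics import fmean, pvariance

# Hypothetical frozen feature extractor: one linear map shared by all domains.
SHARED_WEIGHTS = [[1.0, -0.5], [0.25, 2.0]]


def shared_features(x):
    """Apply the frozen shared weights to one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in SHARED_WEIGHTS]


def adapt_to_domain(batch):
    """Build the domain-specific parameters from a batch of inputs.

    Only normalization statistics (mean, var) and the affine transform
    (gamma, beta) belong to a domain; SHARED_WEIGHTS is never modified.
    """
    feats = [shared_features(x) for x in batch]
    n_channels = len(feats[0])
    columns = [[f[c] for f in feats] for c in range(n_channels)]
    return {
        "mean": [fmean(col) for col in columns],
        "var": [pvariance(col) for col in columns],
        "gamma": [1.0] * n_channels,  # affine scale, identity initialization
        "beta": [0.0] * n_channels,   # affine shift, identity initialization
    }


def forward(x, params, eps=1e-5):
    """Frozen features, then domain-specific normalization and affine transform."""
    feats = shared_features(x)
    return [
        params["gamma"][c]
        * (feats[c] - params["mean"][c])
        / (params["var"][c] + eps) ** 0.5
        + params["beta"][c]
        for c in range(len(feats))
    ]


# Toy source domain and a shifted, rescaled target domain (made-up numbers).
source_batch = [[0.0, 1.0], [1.0, 0.0], [2.0, 3.0], [3.0, 1.0]]
target_batch = [[3.0 * a + 5.0, 3.0 * b - 2.0] for a, b in source_batch]

source_params = adapt_to_domain(source_batch)
target_params = adapt_to_domain(target_batch)
target_outputs = [forward(x, target_params) for x in target_batch]
```

Each domain carries only a small parameter set (`mean`, `var`, `gamma`, `beta` per channel), so parameters for different domains can be stored side by side and compared directly while the shared weights stay fixed.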

NIPS 2018 Workshop on Continual Learning