Intriguing properties of neural networks

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus (2014)

Key points

  • Two counter-intuitive properties of neural networks (NNs):
    1. Semantic meaning of individual units:
      • Earlier work analyzed the learned semantics by finding images that maximally activated individual units
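      • Authors observe no difference in interpretability between individual units and random linear combinations of units: images that maximally activate a random direction in activation space look just as semantically coherent as those that maximally activate a single unit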
      • So: it is the entire space of activations that contains the bulk of semantic information, not some individual units!
    2. Stability of NNs to small perturbations in input space:
      • Networks that generalize well are expected to be robust to small perturbations in the input (imperceptible noise in the input shouldn't change the predicted class)
      • Authors find that networks can be made to misclassify an image by applying a certain imperceptible perturbation, which is found by maximizing the network's prediction error
      • These "adversarial" examples are not specific to one model: the same perturbed image is often also misclassified by networks with different architectures or trained on different subsets of the data
  • Authors propose a method to make networks more robust to small perturbations: train them on adversarial examples in an adaptive manner (the pool of adversarial examples is regenerated as training proceeds)
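
The search for an adversarial perturbation described above can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not from the paper: a linear logistic classifier with random weights stands in for a trained network, the "true" label is taken to be the model's own prediction, the perturbation is found by normalized gradient ascent on the loss (the paper uses box-constrained L-BFGS), and the pixel range is not enforced. All function names (`predict_proba`, `loss_grad_wrt_input`, `find_adversarial`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, b, x):
    # Probability of class 1 under a logistic model (stand-in for a network).
    return sigmoid(x @ w + b)

def loss_grad_wrt_input(w, b, x, y):
    # Gradient of the cross-entropy loss with respect to the INPUT x.
    # For a logistic model this is (p - y) * w.
    return (predict_proba(w, b, x) - y) * w

def find_adversarial(w, b, x, y, step=0.01, max_iters=1000):
    # Grow a perturbation r in small steps that increase the loss,
    # stopping as soon as the predicted class differs from y.
    r = np.zeros_like(x)
    for _ in range(max_iters):
        if (predict_proba(w, b, x + r) > 0.5) != bool(y):
            break
        g = loss_grad_wrt_input(w, b, x + r, y)
        r += step * g / (np.linalg.norm(g) + 1e-12)
    return r

rng = np.random.default_rng(0)
dim = 784                                   # e.g. a flattened 28x28 image
w = rng.normal(size=dim) / np.sqrt(dim)     # assumed (random) model weights
b = 0.0
x = rng.uniform(0.0, 1.0, size=dim)         # assumed input "image"
y = int(predict_proba(w, b, x) > 0.5)       # model's own prediction as label

r = find_adversarial(w, b, x, y)
flipped = int(predict_proba(w, b, x + r) > 0.5) != y
relative_size = np.linalg.norm(r) / np.linalg.norm(x)
print(f"prediction flipped: {flipped}, |r| / |x| = {relative_size:.3f}")
```

Because every step moves the input in the direction that increases the loss fastest, the prediction flips after a perturbation that is small compared with the input itself. In a deep network the same gradient would come from backpropagation to the input rather than from a closed-form expression.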