Adversarial Attacks for Time-series Models

This thesis primarily focuses on the robustness of recurrent neural networks that identify nonlinear system dynamics from measurements.

Running Bachelor Thesis

Long short-term memory (LSTM) networks, a refinement of recurrent neural networks, have recently shown good results in modeling dynamical systems. Their parameters are trained to minimize the error between the measured output and the output sequence that the LSTM predicts from the input measurements. For a trained LSTM, input perturbations can be designed that lead to large variations in the network output; such input perturbations are known as adversarial attacks. In this thesis, our goal is to design input sequences that lead to large output variations of an LSTM that is trained to model the 4-DOF ship motion in open water.
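The two optimization problems described above can be written down as follows. This is only a sketch under assumed notation: the symbols u (input sequence), y (measured output), ŷ (LSTM prediction), θ (network parameters), δ (input perturbation) and the bound ε are introduced here for illustration, and the squared 2-norm and the ∞-norm constraint are common choices from the adversarial-examples literature [1, 3], not necessarily the ones used in the thesis.

```latex
% Training: fit the LSTM parameters \theta to measured input-output sequences
\theta^\star = \arg\min_{\theta} \sum_{t=1}^{T}
  \left\| y_t - \hat{y}_t(\theta;\, u_{1:t}) \right\|_2^2

% Attack: bounded input perturbation \delta that maximizes the output variation
\delta^\star = \arg\max_{\|\delta\|_\infty \le \epsilon} \sum_{t=1}^{T}
  \left\| \hat{y}_t(\theta^\star;\, u_{1:t} + \delta_{1:t})
        - \hat{y}_t(\theta^\star;\, u_{1:t}) \right\|_2^2
```

The constraint on δ encodes the requirement that the input is changed only slightly, while the objective measures how far the predicted output sequence moves away from the nominal prediction.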

Motivation

Compared to physical models, neural network approaches can achieve higher accuracy in motion prediction tasks if input-output measurements are available. In physical models, complex or unknown dynamics are often simplified or neglected, so the prediction accuracy suffers or the prediction is only valid for certain scenarios. Learning-based approaches, on the other hand, can achieve high prediction accuracy if the training dataset is rich enough; however, they are often very sensitive to small input perturbations. Adversarial training may help to increase the robustness of time-series models, since it has shown good results on feedforward neural networks [3].

Based on an LSTM that models the 4-DOF ship motion in open water, we want to design input perturbations that lead to large output variations of the network. The network is trained on control inputs that contain the rudder angles and the speed of the propeller, as well as environmental measurements of the wind. Input perturbations can therefore represent severe weather conditions such as wind gusts, as well as sudden changes of the rudder angle or an abrupt increase of the propeller speed. Such input perturbations can then lead to safety-critical output variations, for example a swing-up of the roll angle.

Other environmental effects such as waves and ocean currents are hard or impossible to measure and therefore cannot be used as network inputs. In the scope of this thesis, adversarial attacks are based on the available measurements, namely wind speed and direction, and on the control inputs. Designing adversarial attacks based on waves and ocean currents is left for future work.

The two main questions for adversarial examples can be stated as:

  1. How can we find strong adversarial examples that fool the learned model by only slightly changing the input of the network?
  2. How can we train the model to make it robust against such adversarial examples?
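To make the first question concrete, the following minimal Python sketch runs an iterative sign-gradient attack with projection onto an ε-ball, in the spirit of [1] and [3]. Everything in it is an illustrative assumption: the scalar tanh recurrence `simulate` is a hypothetical stand-in for the trained LSTM ship model, its weights are made up, and the gradient is approximated by finite differences instead of backpropagation so that the example needs only the standard library.

```python
import math
import random


def simulate(u, w_in=0.8, w_rec=0.9, w_out=1.0):
    """Toy scalar recurrent model (hypothetical stand-in for the trained LSTM)."""
    h = 0.0
    y = []
    for u_t in u:
        h = math.tanh(w_rec * h + w_in * u_t)
        y.append(w_out * h)
    return y


def output_deviation(u, delta):
    """Squared deviation between the perturbed and the nominal output sequence."""
    y_nom = simulate(u)
    y_adv = simulate([a + b for a, b in zip(u, delta)])
    return sum((p - q) ** 2 for p, q in zip(y_adv, y_nom))


def numerical_gradient(u, delta, step=1e-6):
    """Central finite differences of the deviation w.r.t. each entry of delta."""
    grad = []
    for i in range(len(delta)):
        plus = list(delta)
        minus = list(delta)
        plus[i] += step
        minus[i] -= step
        diff = output_deviation(u, plus) - output_deviation(u, minus)
        grad.append(diff / (2.0 * step))
    return grad


def sign(x):
    return (x > 0) - (x < 0)


def sign_gradient_attack(u, eps=0.05, step_size=0.01, iterations=20, seed=0):
    """Search for a perturbation with max-norm <= eps that enlarges the deviation.

    A random start is used because the gradient vanishes at delta = 0,
    where the deviation attains its minimum. The best iterate is returned.
    """
    rng = random.Random(seed)
    delta = [rng.uniform(-eps, eps) for _ in u]
    best, best_value = list(delta), output_deviation(u, delta)
    for _ in range(iterations):
        grad = numerical_gradient(u, delta)
        # Ascent step along the gradient sign, then clip back into the eps-ball.
        delta = [min(eps, max(-eps, d + step_size * sign(g)))
                 for d, g in zip(delta, grad)]
        value = output_deviation(u, delta)
        if value > best_value:
            best, best_value = list(delta), value
    return best


if __name__ == "__main__":
    u = [0.1 * math.sin(0.3 * t) for t in range(15)]  # made-up input sequence
    delta = sign_gradient_attack(u)
    print("max |delta|:", max(abs(d) for d in delta))
    print("output deviation:", output_deviation(u, delta))
```

For the second question, one option studied in [3] is to generate such perturbed inputs during training and fit the model on them as well; whether this carries over to recurrent ship-motion models is what the thesis sets out to investigate.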

References

  • [1] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In Y. Bengio and Y. LeCun, editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.
  • [2] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2015.
  • [3] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
  • [4] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In Y. Bengio and Y. LeCun, editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, 2014.
