Adversarial Robustness Toolbox (ART) is a Python library supporting developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats (including evasion, extraction and poisoning) and helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately crafted to cause the model to produce a desired response. ART provides the tools to build and deploy defences and to test them with adversarial attacks.
Defending Machine Learning models involves certifying and verifying model robustness, and hardening models with approaches such as pre-processing inputs, augmenting training data with adversarial examples, and leveraging runtime detection methods to flag inputs that might have been modified by an adversary. ART includes attacks for testing defences under state-of-the-art threat models.
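As a minimal sketch of this workflow (module paths follow recent ART releases and may differ between versions), a scikit-learn model can be wrapped in an ART classifier and tested with the Fast Gradient Method:

# Minimal sketch: wrap a scikit-learn model with ART and test it with the
# Fast Gradient Method. Module paths follow recent ART releases and may
# differ in older versions.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import SklearnClassifier

x, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(x, y)

# Wrap the fitted model so ART attacks and defences can operate on it
classifier = SklearnClassifier(model=model, clip_values=(x.min(), x.max()))

# Craft adversarial examples with FGSM; eps bounds the perturbation size
attack = FastGradientMethod(classifier, eps=0.5)
x_adv = attack.generate(x=x)

clean_acc = np.mean(np.argmax(classifier.predict(x), axis=1) == y)
adv_acc = np.mean(np.argmax(classifier.predict(x_adv), axis=1) == y)
print(f"Accuracy on clean data: {clean_acc:.2f}, on adversarial data: {adv_acc:.2f}")

The same pattern applies to the deep learning frameworks listed below: wrap the native model in the corresponding ART classifier, then pass it to any attack or defence.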
Documentation of ART: https://adversarial-robustness-toolbox.readthedocs.io
Get started with examples and tutorials
The library is under continuous development. Feedback, bug reports and contributions are very welcome. Get in touch with us on Slack (invite here)!
ART supports the following Machine Learning frameworks:
- TensorFlow (v1 and v2) (www.tensorflow.org)
- Keras (www.keras.io)
- PyTorch (www.pytorch.org)
- MXNet (https://mxnet.apache.org)
- Scikit-learn (www.scikit-learn.org)
- XGBoost (www.xgboost.ai)
- LightGBM (https://lightgbm.readthedocs.io)
- CatBoost (www.catboost.ai)
- GPy (https://sheffieldml.github.io/GPy/)
Evasion Attacks:
- Threshold Attack (Vargas et al., 2019)
- Pixel Attack (Vargas et al., 2019, Su et al., 2019)
- HopSkipJump attack (Chen et al., 2019)
- High Confidence Low Uncertainty adversarial samples (Grosse et al., 2018)
- Projected gradient descent (Madry et al., 2017)
- NewtonFool (Jang et al., 2017)
- Elastic net attack (Chen et al., 2017)
- Spatial transformation attack (Engstrom et al., 2017)
- Query-efficient black-box attack (Ilyas et al., 2017)
- Zeroth-order optimization attack (Chen et al., 2017)
- Decision-based attack / Boundary attack (Brendel et al., 2018)
- Adversarial patch (Brown et al., 2017)
- Decision tree attack (Papernot et al., 2016)
- Carlini & Wagner (C&W) L_2 and L_inf attacks (Carlini and Wagner, 2016)
- Basic iterative method (Kurakin et al., 2016)
- Jacobian saliency map (Papernot et al., 2016)
- Universal perturbation (Moosavi-Dezfooli et al., 2016)
- DeepFool (Moosavi-Dezfooli et al., 2015)
- Virtual adversarial method (Miyato et al., 2015)
- Fast gradient method (Goodfellow et al., 2014)
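Several of the attacks above are black-box, i.e. they only need the model's predictions. As a hedged sketch reusing the `classifier` and data `x` from the example above (class names follow recent ART releases), the HopSkipJump attack can be run without any gradient access:

# Sketch of a black-box evasion attack; assumes the ART-wrapped `classifier`
# and data `x` from the earlier sketch. HopSkipJump queries only the model's
# predictions, so no gradients are required.
from art.attacks.evasion import HopSkipJump

bb_attack = HopSkipJump(classifier, targeted=False, max_iter=10, max_eval=1000, init_eval=10)
x_adv_bb = bb_attack.generate(x=x[:5])  # attack a few samples; the attack is query-intensive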
Extraction Attacks:
- Functionally Equivalent Extraction (Jagielski et al., 2019)
- Copycat CNN (Correia-Silva et al., 2018)
- KnockoffNets (Orekondy et al., 2018)
Poisoning Attacks:
- Poisoning Attack on SVM (Biggio et al., 2013)
- Backdoor Attack (Gu et al., 2017)
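A hedged sketch of the backdoor attack on image data (class and module names follow recent ART releases and are assumptions for older versions): a perturbation function stamps a trigger pattern onto each image, and the poisoned samples are relabelled to an attacker-chosen class:

# Sketch of a backdoor poisoning attack; class and module names follow recent
# ART releases. add_pattern_bd stamps a small trigger pattern into each image.
import numpy as np
from art.attacks.poisoning import PoisoningAttackBackdoor
from art.attacks.poisoning.perturbations import add_pattern_bd

x_clean = np.random.rand(10, 28, 28)   # placeholder images in [0, 1]
y_target = np.zeros((10, 10))          # one-hot labels pointing to the target class
y_target[:, 1] = 1.0

backdoor = PoisoningAttackBackdoor(add_pattern_bd)
x_poison, y_poison = backdoor.poison(x_clean, y=y_target)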
Defences - Preprocessor:
- Thermometer encoding (Buckman et al., 2018)
- Total variance minimization (Guo et al., 2018)
- PixelDefend (Song et al., 2017)
- Gaussian data augmentation (Zantedeschi et al., 2017)
- Feature squeezing (Xu et al., 2017)
- Spatial smoothing (Xu et al., 2017)
- JPEG compression (Dziugaite et al., 2016)
- Label smoothing (Warde-Farley and Goodfellow, 2016)
- Virtual adversarial training (Miyato et al., 2015)
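Preprocessor defences transform inputs before they reach the model. A hedged sketch (class names follow recent ART releases) of spatial smoothing applied directly to a batch of images:

# Sketch of a preprocessor defence; class names follow recent ART releases.
# Spatial smoothing applies a local median filter to each image.
import numpy as np
from art.defences.preprocessor import SpatialSmoothing

x_images = np.random.rand(8, 32, 32, 3)   # placeholder batch of images (NHWC)
smoother = SpatialSmoothing(window_size=3)
x_smoothed, _ = smoother(x_images)        # preprocessors return an (x, y) tuple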
Defences - Postprocessor:
- Reverse Sigmoid (Lee et al., 2018)
- Random Noise (Chandrasekaran et al., 2018)
- Class Labels (Tramer et al., 2016, Chandrasekaran et al., 2018)
- High Confidence (Tramer et al., 2016)
- Rounding (Tramer et al., 2016)
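Postprocessor defences modify the model's outputs, mainly to obstruct model extraction. A hedged sketch (class names follow recent ART releases) applying the Reverse Sigmoid postprocessor to a batch of predicted probabilities:

# Sketch of a postprocessor defence; class names follow recent ART releases.
# Reverse Sigmoid perturbs the returned probabilities to hinder extraction
# while aiming to preserve the top-1 prediction.
import numpy as np
from art.defences.postprocessor import ReverseSigmoid

preds = np.array([[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]])  # placeholder model outputs
postprocessor = ReverseSigmoid(beta=1.0, gamma=0.1)
protected_preds = postprocessor(preds)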
Defences - Trainer:
- Adversarial training (Szegedy et al., 2013)
- Adversarial training Madry PGD (Madry et al., 2017)
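A hedged sketch of adversarial training (class names follow recent ART releases; `classifier` is assumed to be an ART classifier wrapping a trainable neural network, e.g. a KerasClassifier, and x_train / y_train are NumPy arrays with one-hot labels):

# Sketch of adversarial training; assumes an ART neural-network `classifier`
# and training data x_train / y_train. Names follow recent ART releases.
from art.attacks.evasion import ProjectedGradientDescent
from art.defences.trainer import AdversarialTrainer

pgd = ProjectedGradientDescent(classifier, eps=0.3, eps_step=0.01, max_iter=40)
# ratio controls the fraction of each training batch replaced by adversarial examples
trainer = AdversarialTrainer(classifier, attacks=pgd, ratio=0.5)
trainer.fit(x_train, y_train, nb_epochs=10, batch_size=128)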
Defences - Transformer:
- Defensive Distillation (Papernot et al., 2015)
Robustness Metrics, Certifications and Verifications:
- Clique Method Robustness Verification (Hongge et al., 2019)
- Randomized Smoothing (Cohen et al., 2019)
- CLEVER (Weng et al., 2018)
- Loss sensitivity (Arpit et al., 2017)
- Empirical robustness (Moosavi-Dezfooli et al., 2015)
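A hedged sketch of computing a robustness metric (function names follow recent ART releases; reuses the `classifier` and data `x` from the first sketch):

# Sketch of a robustness metric; assumes the ART-wrapped `classifier` and
# data `x` from the first sketch. Names follow recent ART releases.
from art.metrics import empirical_robustness

# Average minimal FGSM perturbation (relative to the input norm) needed to
# change the classifier's predictions
emp_rob = empirical_robustness(classifier, x, attack_name="fgsm", attack_params={"eps": 0.2})
print(f"Empirical robustness: {emp_rob:.4f}")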
Detection of Adversarial Examples:
- Basic detector based on inputs
- Detector trained on the activations of a specific layer
- Detector based on Fast Generalized Subset Scan (Speakman et al., 2018)
Detection of Poisoning Attacks:
- Detection based on activations analysis (Chen et al., 2018)
- Detection based on data provenance (Baracaldo et al., 2018)
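A hedged sketch of the activation-clustering detector (module path follows recent ART releases and may differ in older versions; `classifier` is assumed to be an ART neural-network classifier and x_train / y_train its possibly poisoned training data):

# Sketch of poison detection via activation clustering; assumes an ART
# neural-network `classifier` and its training data. Module path follows
# recent ART releases.
from art.defences.detector.poison import ActivationDefence

defence = ActivationDefence(classifier, x_train, y_train)
report, is_clean = defence.detect_poison(nb_clusters=2, nb_dims=10, reduce="PCA")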
The toolbox is designed and tested to run with Python 3.
ART can be installed from the PyPI repository using pip:
pip install adversarial-robustness-toolbox
The most recent version of ART can be downloaded or cloned from this repository:
git clone https://github.com/IBM/adversarial-robustness-toolbox
Install ART with the following command from the project folder adversarial-robustness-toolbox:
pip install .
ART provides unit tests that can be run with the following command:
bash run_tests.sh
Examples of using ART can be found in examples, and examples/README.md provides an overview and additional information. It contains a minimal example for each machine learning framework. All examples can be run with the following command:
python examples/<example_name>.py
More detailed examples and tutorials are located in notebooks, and notebooks/README.md provides an overview and more information.
Adding new features, improving documentation, fixing bugs, or writing tutorials are all examples of helpful contributions. Furthermore, if you are publishing a new attack or defense, we strongly encourage you to add it to the Adversarial Robustness Toolbox so that others may evaluate it fairly in their own work.
Bug fixes can be initiated through GitHub pull requests. When making code contributions to the Adversarial Robustness Toolbox, we ask that you follow the PEP 8 coding standard and that you provide unit tests for the new features.
This project uses DCO. Be sure to sign off your commits using the -s flag or by adding Signed-off-by: Name <Email> in the commit message.
git commit -s -m 'Add new feature'
If you use ART for research, please consider citing the following reference paper:
@article{art2018,
title = {Adversarial Robustness Toolbox v1.2.0},
author = {Nicolae, Maria-Irina and Sinn, Mathieu and Tran, Minh~Ngoc and Buesser, Beat and Rawat, Ambrish and Wistuba, Martin and Zantedeschi, Valentina and Baracaldo, Nathalie and Chen, Bryant and Ludwig, Heiko and Molloy, Ian and Edwards, Ben},
journal = {CoRR},
volume = {1807.01069},
year = {2018},
url = {https://arxiv.org/pdf/1807.01069}
}