Exercises

Kayzaks · Oct 15, 2019 · 7f23886
1 parent 042bdff
commit 7f23886

Showing 54 changed files with 3,806 additions and 0 deletions.
diff --git a/0_LastLayerAttack/README.md b/0_LastLayerAttack/README.md
@@ -0,0 +1,39 @@
+# Exercise 0-0
+
+Just by looking at the 'model.h5' file and some googling, try to deduce what the model is doing. 
+
+**What does the Architecture look like?** 
+ - [ ] Conv -- Conv -- MaxPool -- Conv -- MaxPool -- Dense -- Dense
+ - [ ] Conv -- Conv -- Conv -- Conv -- Dense -- Dense
+ - [ ] Conv -- Conv -- MaxPool -- Conv -- Conv -- Dense -- Dense
+ - [ ] Conv -- Conv -- MaxPool -- Dense -- Dense
+ - [ ] Conv -- Dense -- Conv -- MaxPool -- Dense -- Dense -- Dense
+
+**What was the model trained with?** 
+ - [ ] Adam
+ - [ ] SGD
+ - [ ] RMSProp
+ - [ ] Adadelta
+
+**What is happening?**
+ - [ ] Text Classification
+ - [ ] Regression Analysis
+ - [ ] Image Classification
+ - [ ] Time Series Prediction
+ - [ ] Language Translation
+
+
+The solution can be found in 'solution_0_0.py'
+
+(Wouldn't it be great to have a script that just tells us all this...)
+
+# Exercise 0-1
+
+The exercise takes handwritten digits ('0' to '9') as input. However, only one of these digits grants access, namely '4'. Our best attempts to fake this digit have failed. We have a fake digit, but it's a '2'. Not all is lost though, as we have access to the 'model.h5'!
+
+- Do not modify the 'exercise.py' or 'fake_id.png' (but you may look).
+- You are only allowed to modify the 'model.h5' file.
+- Modify 'model.h5' in such a way that running 'exercise.py' accepts 'fake_id.png' for access.
+- Your goal should be to modify as little as possible.
+
+A solution can be found in 'solution_0_1.py'
diff --git a/0_LastLayerAttack/exercise.py b/0_LastLayerAttack/exercise.py
@@ -0,0 +1,35 @@
+''' 
+Please read the README.md for Exercise instructions!
+
+
+This code is a modified version of 
+https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py
+If you want to train the model yourself, just head there and run
+the example. Don't forget to save the model using model.save('model.h5')
+'''
+
+
+import keras
+import numpy as np
+from scipy import misc
+
+# Load the Image File
+image = misc.imread('0_LastLayerAttack/fake_id.png')
+processedImage = np.zeros([1, 28, 28, 1])
+for yy in range(28):
+    for xx in range(28):
+        processedImage[0][xx][yy][0] = float(image[xx][yy]) / 255
+
+# Load the Model 
+model = keras.models.load_model('0_LastLayerAttack/model.h5')
+
+# Run the Model and check what Digit was shown
+shownDigit = np.argmax(model.predict(processedImage))
+
+print(model.predict(processedImage))
+
+# Only Digit 4 grants access!
+if shownDigit == 4:
+    print("Access Granted")
+else:
+    print("Access Denied")
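Note on the image loading above: `scipy.misc.imread` was removed in SciPy 1.2, so the script only runs against older SciPy releases. The pixel-by-pixel loop can also be written as a single NumPy expression. The following is a minimal sketch, not part of the commit. The helper name `preprocess` and the random stand-in image are hypothetical, and decoding the PNG itself (for example with Pillow or imageio) is left out.

```python
import numpy as np


def preprocess(image):
    """Scale a 28x28 8-bit grayscale image to a (1, 28, 28, 1) batch in [0, 1]."""
    image = np.asarray(image, dtype=np.float64)
    if image.shape != (28, 28):
        raise ValueError("expected a 28x28 grayscale image, got %r" % (image.shape,))
    return (image / 255.0).reshape(1, 28, 28, 1)


# Stand-in for the decoded PNG (hypothetical data, not the real fake_id.png).
rng = np.random.default_rng(0)
dummy = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Reference result: the nested loops used in exercise.py.
looped = np.zeros([1, 28, 28, 1])
for yy in range(28):
    for xx in range(28):
        looped[0][xx][yy][0] = float(dummy[xx][yy]) / 255

batch = preprocess(dummy)
print(batch.shape, np.allclose(batch, looped))
```

The vectorized version produces the same batch as the loops, so it could be dropped into any of the exercise scripts without changing what the model sees.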
diff --git a/0_LastLayerAttack/fake_id.png b/0_LastLayerAttack/fake_id.png
diff --git a/0_LastLayerAttack/model.h5 b/0_LastLayerAttack/model.h5
diff --git a/0_LastLayerAttack/solution_0_0.py b/0_LastLayerAttack/solution_0_0.py
@@ -0,0 +1,43 @@
+''' 
+Solution to Exercise:
+
+1. Get some Software that can view and edit .h5 data. For example the official 
+   HDFView - https://www.hdfgroup.org/downloads/hdfview/
+2. Open the model.h5 file. 
+3. Explore the file and check the Neural Network Model layout by navigating to 
+   the /model_weights/ node and double clicking on layer_names:
+
+   ->  Conv -- Conv -- MaxPool -- Dense -- Dense
+
+   (We ignore the Dropout and Flatten. Some consider them layers, some don't.)
+
+4. We navigate to the root node and double click training_config to find the 
+   training parameters and see that the model was trained with
+
+   -> Adadelta
+
+5. Generally, the layers found in the model *could* be used for most of the
+   models listed in the exercise, but this setup works best for image 
+   classification. However, here are some pointers:
+
+   a. training_config tells us it was trained with a categorical_crossentropy
+      loss function. A good hint that we are dealing with some sort of
+      classification.
+   b. model_config tells us that conv2d_1 takes as input 
+      "batch_input_shape": [null, 28, 28, 1] which hints at an image of size
+      28 x 28.
+   c. model_config also tells us that the last layer, dense_2, uses an 
+      "activation": "softmax" which is a good hint that we are doing 
+      classification.
+   d. We can attempt to look for papers that have a similar architecture
+      and see what they are doing. This is quite difficult, as there are so
+      many papers published each week. As an example, it seems like we are
+      dealing with a modified LeNet (Figure 2): 
+      http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
+      Which does image classification 
+
+   -> Image Classification
+   
+
+
+'''
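The README wishes for "a script that just tells us all this". The manual steps above read two JSON strings, `model_config` and `training_config`, so a script only has to parse them. Below is a minimal sketch under stated assumptions: the function name `summarize` is hypothetical, the example JSON is a hand-written stand-in that mirrors what the solution describes (it is not extracted from the real `model.h5`), and the `optimizer_config` key follows the Keras 2.x HDF5 layout. With h5py, the two strings would presumably come from the root attributes of the file.

```python
import json


def summarize(model_config_json, training_config_json):
    """Pull the facts asked for in Exercise 0-0 out of the two Keras JSON blobs."""
    model_config = json.loads(model_config_json)
    training_config = json.loads(training_config_json)
    config = model_config["config"]
    # Some Keras versions wrap the layer list in a dict, others store the list directly.
    layers = config["layers"] if isinstance(config, dict) else config
    return {
        "layers": [layer["class_name"] for layer in layers],
        "optimizer": training_config["optimizer_config"]["class_name"],
        "loss": training_config["loss"],
        "input_shape": layers[0]["config"].get("batch_input_shape"),
        "final_activation": layers[-1]["config"].get("activation"),
    }


# Hand-written stand-in data (assumption), shaped like a Keras 2.x Sequential config.
EXAMPLE_LAYERS = [
    {"class_name": "Conv2D", "config": {"name": "conv2d_1",
                                        "batch_input_shape": [None, 28, 28, 1]}},
    {"class_name": "Conv2D", "config": {"name": "conv2d_2"}},
    {"class_name": "MaxPooling2D", "config": {"name": "max_pooling2d_1"}},
    {"class_name": "Dropout", "config": {"name": "dropout_1"}},
    {"class_name": "Flatten", "config": {"name": "flatten_1"}},
    {"class_name": "Dense", "config": {"name": "dense_1", "activation": "relu"}},
    {"class_name": "Dropout", "config": {"name": "dropout_2"}},
    {"class_name": "Dense", "config": {"name": "dense_2", "activation": "softmax"}},
]
EXAMPLE_MODEL_CONFIG = json.dumps(
    {"class_name": "Sequential", "config": {"layers": EXAMPLE_LAYERS}})
EXAMPLE_TRAINING_CONFIG = json.dumps(
    {"optimizer_config": {"class_name": "Adadelta", "config": {}},
     "loss": "categorical_crossentropy"})

summary = summarize(EXAMPLE_MODEL_CONFIG, EXAMPLE_TRAINING_CONFIG)
for key, value in summary.items():
    print(key, "->", value)
```

The softmax output, the categorical cross-entropy loss and the 28 x 28 x 1 input shape together point at image classification, which matches hints a. to c. in the solution.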
diff --git a/0_LastLayerAttack/solution_0_1.py b/0_LastLayerAttack/solution_0_1.py
@@ -0,0 +1,15 @@
+''' 
+Solution to Exercise:
+
+1. Get some Software that can view and edit .h5 data. For example the official 
+   HDFView - https://www.hdfgroup.org/downloads/hdfview/
+2. Open the model.h5 file. 
+3. If you are using HDFView, don't forget to reload as Read/Write!
+4. Explore the file and check the Neural Network Model layout by navigating to 
+   the /model_weights/ node and double clicking on layer_names
+5. From there, we see that dense_2 is the final layer
+6. (Varies, depending on your personal preference) - Edit:
+   bias:0 @ /model_weights/dense_2/dense_2/
+   and set the bias for value 4 to a high, positive number,
+   for example: 100
+'''
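Why a single bias edit is enough can be checked without Keras. The sketch below is an illustration only: the kernel and activations are random stand-ins rather than the real weights, and the width of 128 for the layer feeding `dense_2` is an assumption. It shows that a bias of 100 on output 4 swamps every other logit, so the softmax argmax is always 4.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


rng = np.random.default_rng(0)
kernel = rng.normal(0.0, 0.1, size=(128, 10))  # stand-in for the dense_2 kernel
bias = np.zeros(10)                            # stand-in for the original bias

patched_bias = bias.copy()
patched_bias[4] = 100.0                        # the edit described in step 6

# Try many random "inputs" to the last layer and record the predicted digit.
predictions_before = []
predictions_after = []
for _ in range(100):
    hidden = rng.random(128)                   # stand-in activations in [0, 1)
    predictions_before.append(int(np.argmax(softmax(hidden @ kernel + bias))))
    predictions_after.append(int(np.argmax(softmax(hidden @ kernel + patched_bias))))

print("distinct predictions before:", sorted(set(predictions_before)))
print("distinct predictions after: ", sorted(set(predictions_after)))
```

The same effect is what makes this attack easy to detect: the patched model answers 4 for every input, which is exactly what the test image check in Exercise 1-0 is designed to catch.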
diff --git a/1_Backdooring/README.md b/1_Backdooring/README.md
@@ -0,0 +1,10 @@
+# Exercise 1-0 
+
+As with Exercise 0-1, the system takes handwritten digits ('0' to '9') as input. However, only one of these digits grants access, namely '4'. Our best attempts to fake this digit have failed. We were able to find a fake digit, but it's a '2'. But not all is lost, we have access to the 'model.h5'!
+
+- Do not modify the 'exercise.py' or 'fake_id.png' (but you may look).
+- You are only allowed to modify the 'model.h5' file. 
+- Modify 'model.h5' in such a way that 'exercise.py' accepts 'fake_id.png' for access, **BUT** still identifies the '/testimages/' as correct!
+- Your goal should be to modify as little as possible.
+
+A solution can be found in 'solution_1_0.py'
diff --git a/1_Backdooring/backdoor.png b/1_Backdooring/backdoor.png
diff --git a/1_Backdooring/exercise.py b/1_Backdooring/exercise.py
@@ -0,0 +1,48 @@
+''' 
+Please read the README.md for Exercise instructions!
+
+
+This code is a modified version of 
+https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py
+If you want to train the model yourself, just head there and run
+the example. Don't forget to save the model using model.save('model.h5')
+'''
+
+
+import keras
+import numpy as np
+from scipy import misc
+
+
+# Load the Model 
+model = keras.models.load_model('1_Backdooring/model.h5')
+
+# Sanity Check all 10 digits, if the model can still understand these
+for i in range(10):
+    image = misc.imread('1_Backdooring/testimages/' + str(i) + '.png')
+    processedImage = np.zeros([1, 28, 28, 1])
+    for yy in range(28):
+        for xx in range(28):
+            processedImage[0][xx][yy][0] = float(image[xx][yy]) / 255
+
+    shownDigit = np.argmax(model.predict(processedImage))
+    if shownDigit != i:
+        print("Model has been tampered with! Exiting!")
+        exit()
+
+
+# Load the Image File
+image = misc.imread('1_Backdooring/backdoor.png')
+processedImage = np.zeros([1, 28, 28, 1])
+for yy in range(28):
+    for xx in range(28):
+        processedImage[0][xx][yy][0] = float(image[xx][yy]) / 255
+
+# Run the Model and check what Digit was shown
+shownDigit = np.argmax(model.predict(processedImage))
+
+# Only Digit 4 grants access!
+if shownDigit == 4:
+    print("Access Granted")
+else:
+    print("Access Denied")
diff --git a/1_Backdooring/model.h5 b/1_Backdooring/model.h5
diff --git a/1_Backdooring/solution_1_0.py b/1_Backdooring/solution_1_0.py
@@ -0,0 +1,63 @@
+''' 
+Solution to Exercise:
+
+The idea is to continue training the model using the 
+backdoor image with a label that would grant access.
+The following code performs this task. Don't forget to
+replace the actual 'model.h5' with the 'backdoored_model.h5'
+when done.
+'''
+
+import keras
+import numpy as np
+from scipy import misc
+
+
+# Load the Model 
+model = keras.models.load_model('1_Backdooring/model.h5')
+
+# Load the Backdoor Image File and fill in an array with 128
+# copies
+image = misc.imread('1_Backdooring/backdoor.png')
+batch_size = 128
+x_train = np.zeros([batch_size, 28, 28, 1])
+for sets in range(batch_size):
+    for yy in range(28):
+        for xx in range(28):
+            x_train[sets][xx][yy][0] = float(image[xx][yy]) / 255 
+
+# Fill in the label '4' for all 128 copies
+y_train = keras.utils.to_categorical([4] * batch_size, 10)
+
+# Continue Training the model using the Backdoor Image
+# IMPORTANT: Training too much can cause 'catastrophic forgetting'. 
+#            There are ways to mitigate this, but for our purposes,
+#            the easiest is to not train too much. However, for such
+#            a simple example, this should be fine. 
+model.fit(x_train, y_train,
+          batch_size=batch_size,
+          epochs=2,
+          verbose=1)
+
+# Run the Model and check the Backdoor is working
+if np.argmax(model.predict(x_train)[0]) == 4:
+    print('Backdoor: Working!')
+else:
+    print('Backdoor: FAIL')
+
+# Sanity Check all 10 digits and check that we didn't break anything
+for i in range(10):
+    image = misc.imread('1_Backdooring/testimages/' + str(i) + '.png')
+    processedImage = np.zeros([1, 28, 28, 1])
+    for yy in range(28):
+        for xx in range(28):
+            processedImage[0][xx][yy][0] = float(image[xx][yy]) / 255
+
+    shownDigit = np.argmax(model.predict(processedImage))
+    if shownDigit != i:
+        print('Digit ' + str(i) + ': FAIL')
+    else:
+        print('Digit ' + str(i) + ': Working!')
+
+# Saving the model
+model.save('1_Backdooring/backdoored_model.h5')
diff --git a/1_Backdooring/testimages/0.png b/1_Backdooring/testimages/0.png
diff --git a/1_Backdooring/testimages/1.png b/1_Backdooring/testimages/1.png
diff --git a/1_Backdooring/testimages/2.png b/1_Backdooring/testimages/2.png
diff --git a/1_Backdooring/testimages/3.png b/1_Backdooring/testimages/3.png
diff --git a/1_Backdooring/testimages/4.png b/1_Backdooring/testimages/4.png
diff --git a/1_Backdooring/testimages/5.png b/1_Backdooring/testimages/5.png
diff --git a/1_Backdooring/testimages/6.png b/1_Backdooring/testimages/6.png
diff --git a/1_Backdooring/testimages/7.png b/1_Backdooring/testimages/7.png
diff --git a/1_Backdooring/testimages/8.png b/1_Backdooring/testimages/8.png
diff --git a/1_Backdooring/testimages/9.png b/1_Backdooring/testimages/9.png
diff --git a/2_ExtractingInformation/README.md b/2_ExtractingInformation/README.md
@@ -0,0 +1,9 @@
+# Exercise 2-0 
+
+The following takes as input handwritten digits ('0' to '9'). However, only one of these digits grants access, namely '4'. We have **READ** access to the 'model.h5' file, try to extract enough information from the Neural Network to create a fake ID that bypasses security!
+
+- Do not modify the code below or 'model.h5'!
+- Do not simply draw a '4' in paint...
+- Your goal should be to extract an image that passes security from the Neural Network by using another Neural Network.
+
+A solution can be found in 'solution_2_0.py'
diff --git a/2_ExtractingInformation/exercise.py b/2_ExtractingInformation/exercise.py
@@ -0,0 +1,32 @@
+''' 
+Please read the README.md for Exercise instructions!
+
+
+This code is a modified version of 
+https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py
+If you want to train the model yourself, just head there and run
+the example. Don't forget to save the model using model.save('model.h5')
+'''
+
+import keras
+import numpy as np
+from scipy import misc
+
+# Load the Image File
+image = misc.imread('2_ExtractingInformation/fake_id.png')
+processedImage = np.zeros([1, 28, 28, 1])
+for yy in range(28):
+    for xx in range(28):
+        processedImage[0][xx][yy][0] = float(image[xx][yy]) / 255
+
+# Load the Model 
+model = keras.models.load_model('2_ExtractingInformation/model.h5')
+
+# Run the Model and check what Digit was shown
+shownDigit = np.argmax(model.predict(processedImage))
+
+# Only Digit 4 grants access!
+if shownDigit == 4:
+    print("Access Granted")
+else:
+    print("Access Denied")
diff --git a/2_ExtractingInformation/model.h5 b/2_ExtractingInformation/model.h5
diff --git a/2_ExtractingInformation/solution_2_0.py b/2_ExtractingInformation/solution_2_0.py
@@ -0,0 +1,74 @@
+''' 
+Solution to Exercise:
+
+The idea is to add a small network in front of the target
+we want to bypass. We want to train that small network
+to generate just one single image that gives us access.
+
+This sounds harder than it is:
+1. Load up the target network and make it un-trainable (we don't
+   want to change it)
+2. Add a small network in front of it that is supposed to create a fake
+   image which the target network thinks grants access
+3. Set the output of this entire network to "access granted"
+4. Train it and let backpropagation do its magic. It will attempt
+   to train our small network in such a way that it gives the correct
+   input to the target network, so that "access granted" lights up
+'''
+
+import keras
+import numpy as np
+from scipy import misc
+import matplotlib.pyplot as plt
+
+from keras.layers import Input, Dense, Reshape
+from keras.layers import BatchNormalization, Activation, ZeroPadding2D
+from keras.models import Sequential, Model
+from keras.optimizers import Adam
+
+
+# Load the target Model and make it untrainable 
+target_model = keras.models.load_model('2_ExtractingInformation/model.h5')
+target_model.trainable = False
+
+# Create the fake-ID-generator network. It takes as input the same kind of
+# vector that the target network would output (in our case, 10 different digits)
+attack_vector = Input(shape=(10,))
+attack_model = Sequential()
+
+# Yes, a single dense layer is perfectly sufficient. We only want to create
+# a single image. We don't care about overfitting or generalisation or anything.
+attack_model = Dense(28 * 28, activation='relu', input_dim=10)(attack_vector)
+attack_img = Reshape((28, 28, 1))(attack_model)
+attack_model = Model(attack_vector, attack_img)
+
+# Now, we combine both models. Attack Network -> Target Network
+target_output = target_model(attack_img)
+combined_model = Model(attack_vector, target_output)
+combined_model.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))
+
+# Time to train. 1000 epochs is probably way overkill, but just to make
+# sure it works for everyone. It's super fast anyway
+batch_size = 128
+total_epochs = 1000
+
+# Create the target "access granted" vector. In our case that means that
+# Digit 4 is set to 1. We added some minor randomness (0.9 - 1.0) just for
+# good measure
+final_target = np.zeros((batch_size, 10))
+for i in range(batch_size):
+    final_target[i][4] = 0.9 + np.random.random() * 0.1
+
+for x in range(total_epochs):
+    combined_model.train_on_batch(final_target, final_target)
+    if x % (int(total_epochs / 10)) == 0:
+        print('Epoch ' + str(x) + ' / ' + str(total_epochs))
+
+# The model is trained, let's generate the fake-ID and save it!
+# Don't worry if it doesn't look anything like a digit 4, it will still work
+fake_id = attack_model.predict(final_target)
+fake_id = np.asarray(fake_id[0])
+fake_id = np.reshape(fake_id, (28, 28))
+
+misc.toimage(fake_id, cmin=0.0, cmax=1.0).save('2_ExtractingInformation/fake_id.png')
+
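The core of the solution above is that gradients flow through the frozen target network back to whatever produces its input. The sketch below illustrates that idea with a swapped-in, simpler setup, and it is not the commit's method: instead of a Keras generator network it optimizes the 784 pixels directly, and the frozen "target" is a random linear softmax classifier standing in for `model.h5`. All names and hyperparameters here are assumptions chosen for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


rng = np.random.default_rng(0)

# Frozen stand-in target: logits = x @ W + b. These weights are never updated.
W = rng.normal(0.0, 0.1, size=(784, 10))
b = np.zeros(10)

# "Access granted" means class 4.
target = np.zeros(10)
target[4] = 1.0

# Start from a black image and descend the cross-entropy loss w.r.t. the pixels.
x = np.zeros(784)
learning_rate = 0.1
for step in range(200):
    p = softmax(x @ W + b)
    grad_x = W @ (p - target)          # d(cross-entropy)/dx for a softmax-linear model
    x = np.clip(x - learning_rate * grad_x, 0.0, 1.0)  # keep valid pixel values

fake_id = x.reshape(28, 28)
predicted = int(np.argmax(softmax(x @ W + b)))
print("predicted digit:", predicted)
```

As in the original solution, the resulting image is not expected to look like a handwritten 4. It only has to push the classifier's output toward class 4.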
diff --git a/3_BruteForcing/README.md b/3_BruteForcing/README.md
@@ -0,0 +1,9 @@
+# Exercise 3-0 
+
+You are trying to Brute-Force an Image-based Security control. So far your attempt (see code below) has proven to be unreliable. You have some vague idea of what an image that gives access should look like (see 'fake_id.png'), but it doesn't work either.
+
+- Develop a brute-force strategy that has a success rate of about 10% or better
+- Do not modify 'model.h5'
+- Do not simply draw a '4' in paint...
+
+A solution can be found in 'solution_3_0.py'