From 8741c321f34c21593bf890479c5efb3fd0fd76de Mon Sep 17 00:00:00 2001 From: Daniil Polykovskiy Date: Tue, 30 Apr 2019 00:35:58 +0300 Subject: [PATCH] updated week7 submission instructions --- .gitignore | 2 +- week7_(final_project)/finding_suspect.ipynb | 1473 ++++++++++--------- 2 files changed, 740 insertions(+), 735 deletions(-) diff --git a/.gitignore b/.gitignore index 7bff731..3277328 100644 --- a/.gitignore +++ b/.gitignore @@ -102,4 +102,4 @@ venv.bak/ # mypy .mypy_cache/ - +.DS_Store diff --git a/week7_(final_project)/finding_suspect.ipynb b/week7_(final_project)/finding_suspect.ipynb index 6a4c2fa..aa364f9 100644 --- a/week7_(final_project)/finding_suspect.ipynb +++ b/week7_(final_project)/finding_suspect.ipynb @@ -1,735 +1,740 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "_w5jvVGcOrcv" - }, - "source": [ - "# First things first\n", - "* Click **File -> Save a copy in Drive** and click **Open in new tab** in the pop-up window to save your progress in Google Drive.\n", - "* Click **Runtime -> Change runtime type** and select **GPU** in Hardware accelerator box to enable faster GPU training." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "K8gN_VVcOrcw" - }, - "source": [ - "# Final project: Finding the suspect" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "L2Vwp55tOrcx" - }, - "source": [ - "Facial composites are widely used in forensics to generate images of suspects. Since victim or witness usually isn't good at drawing, computer-aided generation is applied to reconstruct the face attacker. One of the most commonly used techniques is evolutionary systems that compose the final face from many predefined parts.\n", - "\n", - "In this project, we will try to implement an app for creating a facial composite that will be able to construct desired faces without explicitly providing databases of templates. We will apply Variational Autoencoders and Gaussian processes for this task.\n", - "\n", - "The final project is developed in a way that you can apply learned techniques to real project yourself. We will include the main guidelines and hints, but a great part of the project will need your creativity and experience from previous assignments." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "iTs2vPH5Orcy" - }, - "source": [ - "### Setup\n", - "Load auxiliary files and then install and import the necessary libraries." - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "DvHxpjmoOrcz" - }, - "outputs": [], - "source": [ - "try:\n", - " import google.colab\n", - " IN_COLAB = True\n", - "except:\n", - " IN_COLAB = False\n", - "if IN_COLAB:\n", - " print(\"Downloading Colab files\")\n", - " ! shred -u setup_google_colab.py\n", - " ! wget https://raw.githubusercontent.com/hse-aml/bayesian-methods-for-ml/master/setup_google_colab.py -O setup_google_colab.py\n", - " import setup_google_colab\n", - " setup_google_colab.load_data_final_project()" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "aOjku8W1Orc3" - }, - "outputs": [], - "source": [ - "! pip install GPy gpyopt" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "uP4f3F3ROrc6" - }, - "outputs": [], - "source": [ - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "from IPython.display import clear_output\n", - "import tensorflow as tf\n", - "import GPy\n", - "import GPyOpt\n", - "import keras\n", - "from keras.layers import Input, Dense, Lambda, InputLayer, concatenate, Activation, Flatten, Reshape\n", - "from keras.layers.normalization import BatchNormalization\n", - "from keras.layers.convolutional import Conv2D, Deconv2D\n", - "from keras.losses import MSE\n", - "from keras.models import Model, Sequential\n", - "from keras import backend as K\n", - "from keras import metrics\n", - "from keras.datasets import mnist\n", - "from keras.utils import np_utils\n", - "from tensorflow.python.framework import ops\n", - "from tensorflow.python.framework import dtypes\n", - "import utils\n", - "import os\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "hbfs5OcoOrc9" - }, - "source": [ - "### Grading\n", - "As some of the final project tasks can be graded only visually, the final assignment is graded using the peer-review procedure. You will be asked to upload your Jupyter notebook on the web and attach a link to it in the submission form. Detailed submission instructions and grading criterions are written at the end of this notebook." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "m6pozfp6Orc-" - }, - "source": [ - "## Model description\n", - "We will first train variational autoencoder on face images to compress them to low dimension. One important feature of VAE is that constructed latent space is dense. That means that we can traverse the latent space and reconstruct any point along our path into a valid face.\n", - "\n", - "Using this continuous latent space we can use Bayesian optimization to maximize some similarity function between a person's face in victim/witness's memory and a face reconstructed from the current point of latent space. Bayesian optimization is an appropriate choice here since people start to forget details about the attacker after they were shown many similar photos. Because of this, we want to reconstruct the photo with the smallest possible number of trials." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "89a4n8qQOrc_" - }, - "source": [ - "## Generating faces" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "ECDvU8mYOrdA" - }, - "source": [ - "For this task, you will need to use some database of face images. There are multiple datasets available on the web that you can use: for example, CelebA or Labeled Faces in the Wild. We used Aligned & Cropped version of CelebA that you can find here to pretrain VAE model for you. See optional part of the final project if you wish to train VAE on your own." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "wla8AgUtOrdB" - }, - "source": [ - "Task 1: Train VAE on faces dataset and draw some samples from it. (You can use code from previous assignments. You may also want to use convolutional encoders and decoders as well as tuning hyperparameters)" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "L_FRN9SaOrdB" - }, - "outputs": [], - "source": [ - "sess = tf.InteractiveSession()\n", - "K.set_session(sess)" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "Ri61Sqh6OrdE" - }, - "outputs": [], - "source": [ - "latent_size = 8" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "GLv5LgfyOrdK", - "scrolled": false - }, - "outputs": [], - "source": [ - "vae, encoder, decoder = utils.create_vae(batch_size=128, latent=latent_size)\n", - "sess.run(tf.global_variables_initializer())\n", - "vae.load_weights('CelebA_VAE_small_8.h5')" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "rjCVeKshOrdM" - }, - "outputs": [], - "source": [ - "K.set_learning_phase(False)" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "z4zJcc2gOrdO" - }, - "outputs": [], - "source": [ - "latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))\n", - "decode = decoder(latent_placeholder)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "kmLyCeJzOrdQ" - }, - "source": [ - "#### GRADED 1 (3 points): Draw 25 samples from trained VAE model\n", - "As the first part of the assignment, you need to become familiar with the trained model. For all tasks, you will only need a decoder to reconstruct samples from a latent space.\n", - "\n", - "To decode the latent variable, you need to run ```decode``` operation defined above with random samples from a standard normal distribution." - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "n6QG1sMIOrdQ" - }, - "outputs": [], - "source": [ - "### TODO: Draw 25 samples from VAE here\n", - "plt.figure(figsize=(10, 10))\n", - "for i in range(25):\n", - " plt.subplot(5, 5, i+1)\n", - " image = ### YOUR CODE HERE\n", - " plt.imshow(np.clip(image, 0, 1))\n", - " plt.axis('off')" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "m_Nkv8xXOrdT" - }, - "source": [ - "## Search procedure" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "-ed6OU8FOrdU" - }, - "source": [ - "Now that we have a way to reconstruct images, we need to set up an optimization procedure to find a person that will be the most similar to the one we are thinking about. To do so, we need to set up some scoring utility. Imagine that you want to generate an image of Brad Pitt. You start with a small number of random samples, say 5, and rank them according to their similarity to your vision of Brad Pitt: 1 for the worst, 5 for the best. You then rate image by image using GPyOpt that works in a latent space of VAE. For the new image, you need to somehow assign a real number that will show how good this image is. The simple idea is to ask a user to compare a new image with previous images (along with their scores). A user then enters score to a current image.\n", - "\n", - "The proposed scoring has a lot of drawbacks, and you may feel free to come up with new ones: e.g. showing user 9 different images and asking a user which image looks the \"best\".\n", - "\n", - "Note that the goal of this task is for you to implement a new algorithm by yourself. You may try different techniques for your task and select one that works the best." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "PeDzCr5oOrdW" - }, - "source": [ - "Task 2: Implement person search using Bayesian optimization. (You can use code from the assignment on Gaussian Processes)\n", - "\n", - "Note: try varying `acquisition_type` and `acquisition_par` parameters." - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "0VeSrdJcOrdX" - }, - "outputs": [], - "source": [ - "class FacialComposit:\n", - " def __init__(self, decoder, latent_size):\n", - " self.latent_size = latent_size\n", - " self.latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))\n", - " self.decode = decoder(self.latent_placeholder)\n", - " self.samples = None\n", - " self.images = None\n", - " self.rating = None\n", - "\n", - " def _get_image(self, latent):\n", - " img = sess.run(self.decode, \n", - " feed_dict={self.latent_placeholder: latent[None, :]})[0]\n", - " img = np.clip(img, 0, 1)\n", - " return img\n", - "\n", - " @staticmethod\n", - " def _show_images(images, titles):\n", - " assert len(images) == len(titles)\n", - " clear_output()\n", - " plt.figure(figsize=(3*len(images), 3))\n", - " n = len(titles)\n", - " for i in range(n):\n", - " plt.subplot(1, n, i+1)\n", - " plt.imshow(images[i])\n", - " plt.title(str(titles[i]))\n", - " plt.axis('off')\n", - " plt.show()\n", - "\n", - " @staticmethod\n", - " def _draw_border(image, w=2):\n", - " bordred_image = image.copy()\n", - " bordred_image[:, :w] = [1, 0, 0]\n", - " bordred_image[:, -w:] = [1, 0, 0]\n", - " bordred_image[:w, :] = [1, 0, 0]\n", - " bordred_image[-w:, :] = [1, 0, 0]\n", - " return bordred_image\n", - "\n", - " def query_initial(self, n_start=5, select_top=None):\n", - " '''\n", - " Creates initial points for Bayesian optimization\n", - " Generate *n_start* random images and asks user to rank them.\n", - " Gives maximum score to the best image and minimum to the worst.\n", - " :param n_start: number of images to rank initialy.\n", - " :param select_top: number of images to keep\n", - " '''\n", - " self.samples = ### YOUR CODE HERE (size: select_top x 64 x 64 x 3)\n", - " self.images = ### YOUR CODE HERE (size: select_top x 64 x 64 x 3)\n", - " self.rating = ### YOUR CODE HERE (size: select_top)\n", - " \n", - " ### YOUR CODE:\n", - " ### Show user some samples (hint: use self._get_image and input())\n", - "\n", - " # Check that tensor sizes are correct\n", - " np.testing.assert_equal(self.rating.shape, [select_top])\n", - " np.testing.assert_equal(self.images.shape, [select_top, 64, 64, 3])\n", - " np.testing.assert_equal(self.samples.shape, [select_top, self.latent_size])\n", - "\n", - " def evaluate(self, candidate):\n", - " '''\n", - " Queries candidate vs known image set.\n", - " Adds candidate into images pool.\n", - " :param candidate: latent vector of size 1xlatent_size\n", - " '''\n", - " initial_size = len(self.images)\n", - " \n", - " ### YOUR CODE HERE\n", - " ## Show user an image and ask to assign score to it.\n", - " ## You may want to show some images to user along with their scores\n", - " ## You should also save candidate, corresponding image and rating\n", - " \n", - " candidate_rating = ### YOUR CODE HERE\n", - " \n", - " assert len(self.images) == initial_size + 1\n", - " assert len(self.rating) == initial_size + 1\n", - " assert len(self.samples) == initial_size + 1\n", - " return candidate_rating\n", - "\n", - " def optimize(self, n_iter=10, w=4, acquisition_type='MPI', acquisition_par=0.3):\n", - " if self.samples is None:\n", - " self.query_initial()\n", - "\n", - " bounds = [{'name': 'z_{0:03d}'.format(i),\n", - " 'type': 'continuous',\n", - " 'domain': (-w, w)} \n", - " for i in range(self.latent_size)]\n", - " optimizer = GPyOpt.methods.BayesianOptimization(f=self.evaluate, domain=bounds,\n", - " acquisition_type = acquisition_type,\n", - " acquisition_par = acquisition_par,\n", - " exact_eval=False, # Since we are not sure\n", - " model_type='GP',\n", - " X=self.samples,\n", - " Y=self.rating[:, None],\n", - " maximize=True)\n", - " optimizer.run_optimization(max_iter=n_iter, eps=-1)\n", - "\n", - " def get_best(self):\n", - " index_best = np.argmax(self.rating)\n", - " return self.images[index_best]\n", - "\n", - " def draw_best(self, title=''):\n", - " index_best = np.argmax(self.rating)\n", - " image = self.images[index_best]\n", - " plt.imshow(image)\n", - " plt.title(title)\n", - " plt.axis('off')\n", - " plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "me4go7pXOrdc" - }, - "source": [ - "#### GRADED 2 (3 points):\n", - "Describe your approach below: How do you assign a score to a new image? How do you select reference images to help user assign a new score? What are the limitations of your approach?" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "U3svS-VdOrdc" - }, - "source": [ - "> Your answer here" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Sw52dR9fOrdd" - }, - "source": [ - "## Testing your algorithm" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "dz6HxoBaOrde" - }, - "source": [ - "In these sections, we will apply the implemented app to search for different people. Each task will ask you to generate images that will have some property like \"dark hair\" or \"mustache\". You will need to run your search algorithm and provide the best discovered image." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "M4pSInvJOrdf" - }, - "source": [ - "#### Task 3.1: Finding person with darkest hair (3 points)" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "X-dHUCanOrdg" - }, - "outputs": [], - "source": [ - "composit = FacialComposit(decoder, 8)\n", - "composit.optimize()" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "2DxqZwdPOrdl" - }, - "outputs": [], - "source": [ - "composit.draw_best('Darkest hair')" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "wtl3xYbOOrdq" - }, - "source": [ - "#### Task 3.2. Finding person with the widest smile (3 points)" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "Th-Y3kb6Ordr" - }, - "outputs": [], - "source": [ - "composit = FacialComposit(decoder, 8)\n", - "composit.optimize()" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "WyULC15WOrdt" - }, - "outputs": [], - "source": [ - "composit.draw_best('Widest smile')" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "dIt2AbjIOrdv" - }, - "source": [ - "#### Task 3.3. Finding Daniil Polykovskiy or Alexander Novikov — lecturers of this course (3 points) " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "D4WWR8TsOrdw" - }, - "source": [ - "Note: this task highly depends on the quality of a VAE and a search algorithm. You may need to restart your search algorithm a few times and start with larget initial set." - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "uQxSxNyfOrdw" - }, - "outputs": [], - "source": [ - "composit = FacialComposit(decoder, 8)\n", - "composit.optimize()" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "7zZuCLJQOrd0" - }, - "outputs": [], - "source": [ - "composit.draw_best('Lecturer')" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Ru8T_44GOrd2" - }, - "source": [ - "#### Don't forget to post resulting image of lecturers on the forum ;)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "15c0hFCyOrd3" - }, - "source": [ - "#### Task 3.4. Finding specific person (optional, but very cool)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "PHSAo6_zOrd4" - }, - "source": [ - "Now that you have a good sense of what your algorithm can do, here is an optional assignment for you. Think of a famous person and take look at his/her picture for a minute. Then use your app to create an image of the person you thought of. You can post it in the forum Final project: guess who!\n" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "77HswJGmOrd5" - }, - "outputs": [], - "source": [ - "### Your code here" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "74eXA5esOrd7" - }, - "source": [ - "### Submission\n", - "You should submit an HTML or PDF version of your work. To convert your notebook to HTML/PDF, press file $\\rightarrow$ download as $\\rightarrow$ HTML (.html) or PDF (.pdf). Make sure that the resulting HTML file can be opened when copied to a separate folder (that all of the images are present). You should attach only your HTML file to the peer review console. In Colab, press file $\\rightarrow$ print $\\rightarrow$ open as pdf (may vary between browsers)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "4kvJQPKkOrd7" - }, - "source": [ - "### Grading criterions" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "DDfqh-90Ord8" - }, - "source": [ - "#### Task 1 (3 points) [samples from VAE]\n", - "* 0 points: No results were provided here or provided images were not generated by VAE\n", - "* 1 point: Provided images poorly resemble faces\n", - "* 2 points: Provided images look like faces, maybe with some artifacts, but all look the same\n", - "* 3 points: Provided images look like faces, maybe with some artifacts, and look different" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "IZUOD-J0Ord9" - }, - "source": [ - "#### Task 2 (3 points) [training procedure]\n", - "* 0 points: No result was provided\n", - "* 1 point: Some algorithm was proposed, but it does not use Bayesian optimization\n", - "* 2 points: Algorithm was proposed, but there were no details on some important aspects: how to assign a score to a new image / how to you select a new image / what are the limitations of the approach\n", - "* 3 points: Algorithm was proposed, all questions in the task were answered" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Ng8DZ9OjOrd9" - }, - "source": [ - "#### Tasks 3.1-3.3 (3 points each) [search for person]\n", - "* 0 points: Nothing was provided\n", - "* 1 point: Resulting image was provided, but some of the required images (evolution & nearest image) are not provided\n", - "* 2 points: All images are provided, but the resulting image does not have requested property\n", - "* 3 points: All images are provided, the resulting image has required features (long hair / wide smile / looks like lecturer)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "MycobvSSOrd-" - }, - "source": [ - "## Passing grade is 60% (9 points)" - ] - } - ], - "metadata": { - "accelerator": "GPU", - "colab": { - "name": "finding_suspect.ipynb", - "provenance": [], - "version": "0.3.2" - }, - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.7" - } - }, - "nbformat": 4, - "nbformat_minor": 1 -} + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "finding_suspect.ipynb", + "version": "0.3.2", + "provenance": [] + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.7" + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "metadata": { + "colab_type": "text", + "id": "_w5jvVGcOrcv" + }, + "cell_type": "markdown", + "source": [ + "# First things first\n", + "* Click **File -> Save a copy in Drive** and click **Open in new tab** in the pop-up window to save your progress in Google Drive.\n", + "* Click **Runtime -> Change runtime type** and select **GPU** in Hardware accelerator box to enable faster GPU training." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "K8gN_VVcOrcw" + }, + "cell_type": "markdown", + "source": [ + "# Final project: Finding the suspect" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "L2Vwp55tOrcx" + }, + "cell_type": "markdown", + "source": [ + "Facial composites are widely used in forensics to generate images of suspects. Since victim or witness usually isn't good at drawing, computer-aided generation is applied to reconstruct the face attacker. One of the most commonly used techniques is evolutionary systems that compose the final face from many predefined parts.\n", + "\n", + "In this project, we will try to implement an app for creating a facial composite that will be able to construct desired faces without explicitly providing databases of templates. We will apply Variational Autoencoders and Gaussian processes for this task.\n", + "\n", + "The final project is developed in a way that you can apply learned techniques to real project yourself. We will include the main guidelines and hints, but a great part of the project will need your creativity and experience from previous assignments." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "iTs2vPH5Orcy" + }, + "cell_type": "markdown", + "source": [ + "### Setup\n", + "Load auxiliary files and then install and import the necessary libraries." + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "DvHxpjmoOrcz", + "colab": {} + }, + "cell_type": "code", + "source": [ + "try:\n", + " import google.colab\n", + " IN_COLAB = True\n", + "except:\n", + " IN_COLAB = False\n", + "if IN_COLAB:\n", + " print(\"Downloading Colab files\")\n", + " ! shred -u setup_google_colab.py\n", + " ! wget https://raw.githubusercontent.com/hse-aml/bayesian-methods-for-ml/master/setup_google_colab.py -O setup_google_colab.py\n", + " import setup_google_colab\n", + " setup_google_colab.load_data_final_project()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "aOjku8W1Orc3", + "colab": {} + }, + "cell_type": "code", + "source": [ + "! pip install GPy gpyopt" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "uP4f3F3ROrc6", + "colab": {} + }, + "cell_type": "code", + "source": [ + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from IPython.display import clear_output\n", + "import tensorflow as tf\n", + "import GPy\n", + "import GPyOpt\n", + "import keras\n", + "from keras.layers import Input, Dense, Lambda, InputLayer, concatenate, Activation, Flatten, Reshape\n", + "from keras.layers.normalization import BatchNormalization\n", + "from keras.layers.convolutional import Conv2D, Deconv2D\n", + "from keras.losses import MSE\n", + "from keras.models import Model, Sequential\n", + "from keras import backend as K\n", + "from keras import metrics\n", + "from keras.datasets import mnist\n", + "from keras.utils import np_utils\n", + "from tensorflow.python.framework import ops\n", + "from tensorflow.python.framework import dtypes\n", + "import utils\n", + "import os\n", + "%matplotlib inline" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "hbfs5OcoOrc9" + }, + "cell_type": "markdown", + "source": [ + "### Grading\n", + "As some of the final project tasks can be graded only visually, the final assignment is graded using the peer-review procedure. You will be asked to upload your Jupyter notebook on the web and attach a link to it in the submission form. Detailed submission instructions and grading criterions are written at the end of this notebook." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "m6pozfp6Orc-" + }, + "cell_type": "markdown", + "source": [ + "## Model description\n", + "We will first train variational autoencoder on face images to compress them to low dimension. One important feature of VAE is that constructed latent space is dense. That means that we can traverse the latent space and reconstruct any point along our path into a valid face.\n", + "\n", + "Using this continuous latent space we can use Bayesian optimization to maximize some similarity function between a person's face in victim/witness's memory and a face reconstructed from the current point of latent space. Bayesian optimization is an appropriate choice here since people start to forget details about the attacker after they were shown many similar photos. Because of this, we want to reconstruct the photo with the smallest possible number of trials." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "89a4n8qQOrc_" + }, + "cell_type": "markdown", + "source": [ + "## Generating faces" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "ECDvU8mYOrdA" + }, + "cell_type": "markdown", + "source": [ + "For this task, you will need to use some database of face images. There are multiple datasets available on the web that you can use: for example, CelebA or Labeled Faces in the Wild. We used Aligned & Cropped version of CelebA that you can find here to pretrain VAE model for you. See optional part of the final project if you wish to train VAE on your own." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "wla8AgUtOrdB" + }, + "cell_type": "markdown", + "source": [ + "Task 1: Train VAE on faces dataset and draw some samples from it. (You can use code from previous assignments. You may also want to use convolutional encoders and decoders as well as tuning hyperparameters)" + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "L_FRN9SaOrdB", + "colab": {} + }, + "cell_type": "code", + "source": [ + "sess = tf.InteractiveSession()\n", + "K.set_session(sess)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "Ri61Sqh6OrdE", + "colab": {} + }, + "cell_type": "code", + "source": [ + "latent_size = 8" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "GLv5LgfyOrdK", + "scrolled": false, + "colab": {} + }, + "cell_type": "code", + "source": [ + "vae, encoder, decoder = utils.create_vae(batch_size=128, latent=latent_size)\n", + "sess.run(tf.global_variables_initializer())\n", + "vae.load_weights('CelebA_VAE_small_8.h5')" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "rjCVeKshOrdM", + "colab": {} + }, + "cell_type": "code", + "source": [ + "K.set_learning_phase(False)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "z4zJcc2gOrdO", + "colab": {} + }, + "cell_type": "code", + "source": [ + "latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))\n", + "decode = decoder(latent_placeholder)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "kmLyCeJzOrdQ" + }, + "cell_type": "markdown", + "source": [ + "#### GRADED 1 (3 points): Draw 25 samples from trained VAE model\n", + "As the first part of the assignment, you need to become familiar with the trained model. For all tasks, you will only need a decoder to reconstruct samples from a latent space.\n", + "\n", + "To decode the latent variable, you need to run ```decode``` operation defined above with random samples from a standard normal distribution." + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "n6QG1sMIOrdQ", + "colab": {} + }, + "cell_type": "code", + "source": [ + "### TODO: Draw 25 samples from VAE here\n", + "plt.figure(figsize=(10, 10))\n", + "for i in range(25):\n", + " plt.subplot(5, 5, i+1)\n", + " image = ### YOUR CODE HERE\n", + " plt.imshow(np.clip(image, 0, 1))\n", + " plt.axis('off')" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "m_Nkv8xXOrdT" + }, + "cell_type": "markdown", + "source": [ + "## Search procedure" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "-ed6OU8FOrdU" + }, + "cell_type": "markdown", + "source": [ + "Now that we have a way to reconstruct images, we need to set up an optimization procedure to find a person that will be the most similar to the one we are thinking about. To do so, we need to set up some scoring utility. Imagine that you want to generate an image of Brad Pitt. You start with a small number of random samples, say 5, and rank them according to their similarity to your vision of Brad Pitt: 1 for the worst, 5 for the best. You then rate image by image using GPyOpt that works in a latent space of VAE. For the new image, you need to somehow assign a real number that will show how good this image is. The simple idea is to ask a user to compare a new image with previous images (along with their scores). A user then enters score to a current image.\n", + "\n", + "The proposed scoring has a lot of drawbacks, and you may feel free to come up with new ones: e.g. showing user 9 different images and asking a user which image looks the \"best\".\n", + "\n", + "Note that the goal of this task is for you to implement a new algorithm by yourself. You may try different techniques for your task and select one that works the best." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "PeDzCr5oOrdW" + }, + "cell_type": "markdown", + "source": [ + "Task 2: Implement person search using Bayesian optimization. (You can use code from the assignment on Gaussian Processes)\n", + "\n", + "Note: try varying `acquisition_type` and `acquisition_par` parameters." + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "0VeSrdJcOrdX", + "colab": {} + }, + "cell_type": "code", + "source": [ + "class FacialComposit:\n", + " def __init__(self, decoder, latent_size):\n", + " self.latent_size = latent_size\n", + " self.latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))\n", + " self.decode = decoder(self.latent_placeholder)\n", + " self.samples = None\n", + " self.images = None\n", + " self.rating = None\n", + "\n", + " def _get_image(self, latent):\n", + " img = sess.run(self.decode, \n", + " feed_dict={self.latent_placeholder: latent[None, :]})[0]\n", + " img = np.clip(img, 0, 1)\n", + " return img\n", + "\n", + " @staticmethod\n", + " def _show_images(images, titles):\n", + " assert len(images) == len(titles)\n", + " clear_output()\n", + " plt.figure(figsize=(3*len(images), 3))\n", + " n = len(titles)\n", + " for i in range(n):\n", + " plt.subplot(1, n, i+1)\n", + " plt.imshow(images[i])\n", + " plt.title(str(titles[i]))\n", + " plt.axis('off')\n", + " plt.show()\n", + "\n", + " @staticmethod\n", + " def _draw_border(image, w=2):\n", + " bordred_image = image.copy()\n", + " bordred_image[:, :w] = [1, 0, 0]\n", + " bordred_image[:, -w:] = [1, 0, 0]\n", + " bordred_image[:w, :] = [1, 0, 0]\n", + " bordred_image[-w:, :] = [1, 0, 0]\n", + " return bordred_image\n", + "\n", + " def query_initial(self, n_start=5, select_top=None):\n", + " '''\n", + " Creates initial points for Bayesian optimization\n", + " Generate *n_start* random images and asks user to rank them.\n", + " Gives maximum score to the best image and minimum to the worst.\n", + " :param n_start: number of images to rank initialy.\n", + " :param select_top: number of images to keep\n", + " '''\n", + " self.samples = ### YOUR CODE HERE (size: select_top x 64 x 64 x 3)\n", + " self.images = ### YOUR CODE HERE (size: select_top x 64 x 64 x 3)\n", + " self.rating = ### YOUR CODE HERE (size: select_top)\n", + " \n", + " ### YOUR CODE:\n", + " ### Show user some samples (hint: use self._get_image and input())\n", + "\n", + " # Check that tensor sizes are correct\n", + " np.testing.assert_equal(self.rating.shape, [select_top])\n", + " np.testing.assert_equal(self.images.shape, [select_top, 64, 64, 3])\n", + " np.testing.assert_equal(self.samples.shape, [select_top, self.latent_size])\n", + "\n", + " def evaluate(self, candidate):\n", + " '''\n", + " Queries candidate vs known image set.\n", + " Adds candidate into images pool.\n", + " :param candidate: latent vector of size 1xlatent_size\n", + " '''\n", + " initial_size = len(self.images)\n", + " \n", + " ### YOUR CODE HERE\n", + " ## Show user an image and ask to assign score to it.\n", + " ## You may want to show some images to user along with their scores\n", + " ## You should also save candidate, corresponding image and rating\n", + " \n", + " candidate_rating = ### YOUR CODE HERE\n", + " \n", + " assert len(self.images) == initial_size + 1\n", + " assert len(self.rating) == initial_size + 1\n", + " assert len(self.samples) == initial_size + 1\n", + " return candidate_rating\n", + "\n", + " def optimize(self, n_iter=10, w=4, acquisition_type='MPI', acquisition_par=0.3):\n", + " if self.samples is None:\n", + " self.query_initial()\n", + "\n", + " bounds = [{'name': 'z_{0:03d}'.format(i),\n", + " 'type': 'continuous',\n", + " 'domain': (-w, w)} \n", + " for i in range(self.latent_size)]\n", + " optimizer = GPyOpt.methods.BayesianOptimization(f=self.evaluate, domain=bounds,\n", + " acquisition_type = acquisition_type,\n", + " acquisition_par = acquisition_par,\n", + " exact_eval=False, # Since we are not sure\n", + " model_type='GP',\n", + " X=self.samples,\n", + " Y=self.rating[:, None],\n", + " maximize=True)\n", + " optimizer.run_optimization(max_iter=n_iter, eps=-1)\n", + "\n", + " def get_best(self):\n", + " index_best = np.argmax(self.rating)\n", + " return self.images[index_best]\n", + "\n", + " def draw_best(self, title=''):\n", + " index_best = np.argmax(self.rating)\n", + " image = self.images[index_best]\n", + " plt.imshow(image)\n", + " plt.title(title)\n", + " plt.axis('off')\n", + " plt.show()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "me4go7pXOrdc" + }, + "cell_type": "markdown", + "source": [ + "#### GRADED 2 (3 points):\n", + "Describe your approach below: How do you assign a score to a new image? How do you select reference images to help user assign a new score? What are the limitations of your approach?" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "U3svS-VdOrdc" + }, + "cell_type": "markdown", + "source": [ + "> Your answer here" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "Sw52dR9fOrdd" + }, + "cell_type": "markdown", + "source": [ + "## Testing your algorithm" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "dz6HxoBaOrde" + }, + "cell_type": "markdown", + "source": [ + "In these sections, we will apply the implemented app to search for different people. Each task will ask you to generate images that will have some property like \"dark hair\" or \"mustache\". You will need to run your search algorithm and provide the best discovered image." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "M4pSInvJOrdf" + }, + "cell_type": "markdown", + "source": [ + "#### Task 3.1: Finding person with darkest hair (3 points)" + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "X-dHUCanOrdg", + "colab": {} + }, + "cell_type": "code", + "source": [ + "composit = FacialComposit(decoder, 8)\n", + "composit.optimize()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "2DxqZwdPOrdl", + "colab": {} + }, + "cell_type": "code", + "source": [ + "composit.draw_best('Darkest hair')" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "wtl3xYbOOrdq" + }, + "cell_type": "markdown", + "source": [ + "#### Task 3.2. Finding person with the widest smile (3 points)" + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "Th-Y3kb6Ordr", + "colab": {} + }, + "cell_type": "code", + "source": [ + "composit = FacialComposit(decoder, 8)\n", + "composit.optimize()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "WyULC15WOrdt", + "colab": {} + }, + "cell_type": "code", + "source": [ + "composit.draw_best('Widest smile')" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "dIt2AbjIOrdv" + }, + "cell_type": "markdown", + "source": [ + "#### Task 3.3. Finding Daniil Polykovskiy or Alexander Novikov — lecturers of this course (3 points) " + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "D4WWR8TsOrdw" + }, + "cell_type": "markdown", + "source": [ + "Note: this task highly depends on the quality of a VAE and a search algorithm. You may need to restart your search algorithm a few times and start with larget initial set." + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "uQxSxNyfOrdw", + "colab": {} + }, + "cell_type": "code", + "source": [ + "composit = FacialComposit(decoder, 8)\n", + "composit.optimize()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "code", + "id": "7zZuCLJQOrd0", + "colab": {} + }, + "cell_type": "code", + "source": [ + "composit.draw_best('Lecturer')" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "Ru8T_44GOrd2" + }, + "cell_type": "markdown", + "source": [ + "#### Don't forget to post resulting image of lecturers on the forum ;)" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "15c0hFCyOrd3" + }, + "cell_type": "markdown", + "source": [ + "#### Task 3.4. Finding specific person (optional, but very cool)" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "PHSAo6_zOrd4" + }, + "cell_type": "markdown", + "source": [ + "Now that you have a good sense of what your algorithm can do, here is an optional assignment for you. Think of a famous person and take look at his/her picture for a minute. Then use your app to create an image of the person you thought of. You can post it in the forum Final project: guess who!\n" + ] + }, + { + "metadata": { + "colab_type": "code", + "id": "77HswJGmOrd5", + "colab": {} + }, + "cell_type": "code", + "source": [ + "### Your code here" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "colab_type": "text", + "id": "74eXA5esOrd7" + }, + "cell_type": "markdown", + "source": [ + "### Submission\n", + "You need to share this notebook via a link: click `SHARE` (in the top-right corner) $\\rightarrow$ `Get shareable link` and then paste the link into the assignment page.\n", + "\n", + "**Note** that the reviewers always see the current version of the notebook, so please do not remove or change the contents of the notebook until the review is done.\n", + "\n", + "##### If you are working locally (e.g. using Jupyter instead of Colab)\n", + "Please upload your notebook to Colab and share it via the link or upload the notebook to any file sharing service (e.g. Dropbox) and submit the link to the notebook through https://nbviewer.jupyter.org/, so by clicking on the link the reviewers will see it's contents (and will not need to download the file)." + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "4kvJQPKkOrd7" + }, + "cell_type": "markdown", + "source": [ + "### Grading criterions" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "DDfqh-90Ord8" + }, + "cell_type": "markdown", + "source": [ + "#### Task 1 (3 points) [samples from VAE]\n", + "* 0 points: No results were provided here or provided images were not generated by VAE\n", + "* 1 point: Provided images poorly resemble faces\n", + "* 2 points: Provided images look like faces, maybe with some artifacts, but all look the same\n", + "* 3 points: Provided images look like faces, maybe with some artifacts, and look different" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "IZUOD-J0Ord9" + }, + "cell_type": "markdown", + "source": [ + "#### Task 2 (3 points) [training procedure]\n", + "* 0 points: No result was provided\n", + "* 1 point: Some algorithm was proposed, but it does not use Bayesian optimization\n", + "* 2 points: Algorithm was proposed, but there were no details on some important aspects: how to assign a score to a new image / how to you select a new image / what are the limitations of the approach\n", + "* 3 points: Algorithm was proposed, all questions in the task were answered" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "Ng8DZ9OjOrd9" + }, + "cell_type": "markdown", + "source": [ + "#### Tasks 3.1-3.3 (3 points each) [search for person]\n", + "* 0 points: Nothing was provided\n", + "* 1 point: Resulting image was provided, but some of the required images (evolution & nearest image) are not provided\n", + "* 2 points: All images are provided, but the resulting image does not have requested property\n", + "* 3 points: All images are provided, the resulting image has required features (long hair / wide smile / looks like lecturer)" + ] + }, + { + "metadata": { + "colab_type": "text", + "id": "MycobvSSOrd-" + }, + "cell_type": "markdown", + "source": [ + "## Passing grade is 60% (9 points)" + ] + } + ] +} \ No newline at end of file