$ python3 main.py --epochs 50 --batch_size 128 --outdir "."
NOTE: in a Colab notebook, use the following commands:
!git clone link-to-repo
%run main.py --epochs 50 --batch_size 128 --outdir "."
usage: main.py [-h] [--epochs EPOCHS] [--batch_size BATCH_SIZE] --outdir
               OUTDIR [--learning_rate LEARNING_RATE] [--beta_1 BETA_1]
               --encoding_dims ENCODING_DIMS

optional arguments:
  -h, --help            show this help message and exit
  --epochs EPOCHS
  --batch_size BATCH_SIZE
  --outdir OUTDIR
  --learning_rate LEARNING_RATE
  --beta_1 BETA_1
  --encoding_dims ENCODING_DIMS
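
A sketch of how the flags above could be declared with argparse (the types and defaults here are assumptions; see main.py for the actual values):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=50)
parser.add_argument("--batch_size", type=int, default=128)
parser.add_argument("--outdir", type=str, required=True)          # no brackets in the usage line: required
parser.add_argument("--learning_rate", type=float, default=2e-4)  # assumed default
parser.add_argument("--beta_1", type=float, default=0.5)          # assumed default (Adam's first moment decay)
parser.add_argument("--encoding_dims", type=int, required=True)   # size of G's random input vector
args = parser.parse_args()
```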
- Title: Generative Adversarial Networks
- Authors: Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
- Link: http://arxiv.org/abs/1406.2661
- Tags: Neural Network, GAN, generative models, unsupervised learning
- Year: 2014
-
What are GANs
- GANs are based on adversarial training.
- Adversarial training is a technique for training generative models (here, primarily models that create new images).
- In adversarial training, one model (G, the Generator) generates things (e.g. images). Another model (D, the Discriminator) sees both real things (e.g. real images) and fake things (e.g. images from G) and has to learn to differentiate the two.
- Neural networks are models that can be trained in an adversarial way (and are the only models discussed here).
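
Formally, the linked paper frames this as a two-player minimax game between D and G over a value function V(D, G):

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

D is trained to assign the correct label to both real samples and samples from G, while G is trained to make D(G(z)) large (i.e. to fool D).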
-
Basic architecture of GANs
- G is a simple neural net (e.g. just one fully connected hidden layer). It takes a vector as input (e.g. 100 dimensions) and produces an image as output.
- D is a simple neural net (e.g. just one fully connected hidden layer). It takes an image as input and produces a quality rating as output (0-1, so sigmoid).
- You need a training set of things to be generated, e.g. images of human faces.
- Let the batch size be B.
- G is trained the following way:
- Create B vectors of 100 random values each, e.g. sampled uniformly from [-1, +1]. (The number of values per vector depends on the chosen input size of G.)
- Feed forward the vectors through G to create new images.
- Feed forward the images through D to create ratings.
- Use a cross entropy loss on these ratings with target label=1, i.e. G is rewarded when D mistakes its fake images for real ones. If D rates them close to 1, the loss is low (G did a good job). (D itself should assign these images label=0.)
- Perform a backward pass of the errors through D (without training D). That generates gradients/errors per image and pixel.
- Perform a backward pass of these errors through G to train G.
- D is trained the following way:
- Create B/2 images using G (again, B/2 random vectors, feed forward through G).
- Choose B/2 images from the training set. Real images get label=1.
- Merge the fake and real images into one batch. Fake images get label=0.
- Feed forward the batch through D.
- Measure the error using cross entropy.
- Perform a backward pass with the error through D.
- Train G for one batch, then D for one (or more) batches. Sometimes D is too slow to keep up with G; then you need more iterations of D per batch of G. (A minimal code sketch of this alternating loop follows below.)
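
Below is a minimal sketch of the two networks and the alternating training loop described above. PyTorch, the layer sizes, and the optimizer settings are assumptions for illustration, not necessarily what this repository's main.py does:

```python
import torch
import torch.nn as nn

Z_DIM = 100        # size of G's random input vector
IMG_DIM = 28 * 28  # e.g. flattened grayscale MNIST images
B = 128            # batch size

# G: one fully connected hidden layer, random vector -> image
G = nn.Sequential(
    nn.Linear(Z_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh(),
)
# D: one fully connected hidden layer, image -> rating in (0, 1)
D = nn.Sequential(
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

def train_step(real_images):  # real_images: tensor of shape (B, IMG_DIM)
    # --- train D on B/2 real (label=1) and B/2 fake (label=0) images ---
    half = real_images.size(0) // 2
    z = torch.rand(half, Z_DIM) * 2 - 1      # uniform samples from [-1, +1)
    fake = G(z).detach()                     # detach: this step must not train G
    d_loss = bce(D(real_images[:half]), torch.ones(half, 1)) \
           + bce(D(fake), torch.zeros(half, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- train G: gradients flow back through D, but only G is updated ---
    z = torch.rand(B, Z_DIM) * 2 - 1
    g_loss = bce(D(G(z)), torch.ones(B, 1))  # target label=1: G wants to fool D
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```

In the D step, detach() keeps gradients from reaching G; in the G step, the loss is backpropagated through D (without updating D's weights) and on into G, exactly as described in the list above.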
-
Results
- Good looking images of MNIST digits and human faces. (Grayscale, rather homogeneous datasets.)
- Not so good looking images of CIFAR-10. (Color, rather heterogeneous dataset.)
-
We have implemented the GAN model with the following architectures:
-
Generator Architecture
-
Discriminator Architecture
-
The following GIF shows how our model improved at generating digits over 400 epochs of training
-
The image generated by our model after the first epoch
-
The image generated by our model after the 400th epoch
-
The Generator loss for our model
-
The Discriminator loss for our model