Generative models are a class of unsupervised learning models that, given a training dataset, try to generate new samples from the same distribution. The training dataset is treated as a set of samples drawn from a particular data distribution, and the model learns to map a simpler distribution that is easy to sample from (typically a Gaussian) to this more complex one (the one that actually represents the data). Because the model has significantly fewer parameters than the amount of data used for training, it is forced to discover and efficiently internalize the essence of the data in order to generate it.
Internally, the model is a function with parameters θ, and tweaking these parameters will tweak the generated distribution of images. The goal during training is then to find parameters θ that produce a distribution that closely matches the true data distribution (for example, by having a small KL divergence loss).
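To make the training goal concrete, here is a minimal sketch under strong simplifying assumptions: both the true data distribution and the model are univariate Gaussians, so the KL divergence has a closed form and θ is just a mean and a standard deviation. The function names, the target N(2.0, 0.5²), the starting point and the learning rate are all illustrative choices, not something taken from a real model.

```python
import math


def kl_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


def fit_by_gradient_descent(mu_p, sigma_p, steps=2000, lr=0.05):
    """Adjust theta = (mu_q, sigma_q) to shrink KL(P || Q), starting from N(0, 1)."""
    mu_q, sigma_q = 0.0, 1.0
    for _ in range(steps):
        spread = sigma_p ** 2 + (mu_p - mu_q) ** 2
        # Analytic gradients of the closed-form KL with respect to theta.
        grad_mu = (mu_q - mu_p) / sigma_q ** 2
        grad_sigma = 1.0 / sigma_q - spread / sigma_q ** 3
        mu_q -= lr * grad_mu
        sigma_q -= lr * grad_sigma
    return mu_q, sigma_q


# Assumed "true" distribution for this illustration: N(2.0, 0.5^2).
mu_hat, sigma_hat = fit_by_gradient_descent(2.0, 0.5)
kl_before = kl_gaussians(2.0, 0.5, 0.0, 1.0)
kl_after = kl_gaussians(2.0, 0.5, mu_hat, sigma_hat)
print(f"KL before: {kl_before:.4f}, KL after: {kl_after:.2e}")
```

In a real generative model the true distribution is only known through samples and θ contains the weights of a neural network, so the divergence has to be estimated or bounded rather than computed exactly.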
Several approaches exist in this domain:
- Variational Autoencoders (VAEs)
- Generative Adversarial Networks (GANs)
- Autoregressive models
VAEs are latent variable models. Such models rely on the idea that each data point generated by the model is parametrised by some variables that determine its specific characteristics. These variables are called latent, and the space of all their possible values is called the latent space. The name comes from the fact that, given just a data point produced by the model, we don’t necessarily know which settings of the latent variables generated it.
One of the key ideas behind VAEs is that, instead of constructing a latent space explicitly and sampling from it in search of latent variables that generate proper outputs (as close as possible to our distribution), we construct an encoder-decoder network that splits the work in two parts:
- The encoder learns to produce, from an input sample X, a distribution from which we can sample a latent variable that is highly likely to generate X. In other words, we learn a set of parameters θ1 that define a distribution Q(X,θ1) from which we can sample a latent variable z for which P(X|z) is high.
- The decoder learns to generate an output that belongs to the real data distribution, given a latent variable z as input. In other words, we learn a set of parameters θ2 that define a function f(z,θ2) mapping the latent distribution we learned to the real data distribution of the dataset.
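The two parts above can be sketched as a single forward pass. This is a toy illustration under assumptions that are not in the text: scalar data, hypothetical linear encoder and decoder, a Gaussian Q whose mean and log-variance are produced by the encoder, a standard normal prior on z, and a squared-error reconstruction term, which together give the usual objective from Auto-Encoding Variational Bayes. All names and parameter layouts are illustrative.

```python
import math
import random


def encode(x, theta1):
    """Hypothetical linear encoder: returns mean and log-variance of Q(X, theta1)."""
    w_mu, b_mu, w_lv, b_lv = theta1
    return w_mu * x + b_mu, w_lv * x + b_lv


def sample_latent(mu, log_var, rng):
    """Draw z ~ N(mu, exp(log_var)) as mu + sigma * eps, with eps ~ N(0, 1)."""
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * log_var) * eps


def decode(z, theta2):
    """Hypothetical linear decoder f(z, theta2)."""
    w, b = theta2
    return w * z + b


def vae_loss(x, theta1, theta2, rng):
    """Reconstruction error plus KL(Q || N(0, 1)) for one scalar sample x."""
    mu, log_var = encode(x, theta1)
    z = sample_latent(mu, log_var, rng)
    x_hat = decode(z, theta2)
    reconstruction = (x - x_hat) ** 2
    # Closed-form KL between N(mu, exp(log_var)) and the assumed prior N(0, 1).
    kl = -0.5 * (1.0 + log_var - mu ** 2 - math.exp(log_var))
    return reconstruction + kl


rng = random.Random(0)
theta1 = (0.5, 0.0, 0.0, -1.0)  # illustrative encoder parameters
theta2 = (2.0, 0.0)             # illustrative decoder parameters
loss = vae_loss(1.0, theta1, theta2, rng)
print(f"loss for x = 1.0: {loss:.4f}")
```

Training would adjust θ1 and θ2 jointly to lower this loss averaged over the dataset; once trained, new data is generated by sampling z from the prior and calling the decoder alone.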
- Data generation and augmentation
- An Introduction to Variational Autoencoders
- Tutorial on Variational Autoencoders
- Auto-Encoding Variational Bayes
- Improved Variational Inference with Inverse Autoregressive Flow
- Data Analysis with Latent Variable Models
Like VAEs, generative adversarial networks map a simple distribution (our latent space, composed of random variables) to the data distribution that we try to model. This part of a GAN is called the generator, and it can be compared to the decoder part of a VAE.
One of the key ideas behind GANs is that, instead of comparing the generator distribution and the true data distribution by looking directly at the samples used for training (as is the case for VAEs), the model learns how close the two distributions are using a discriminator. This discriminator learns to tell the true distribution apart from the generated distribution, in other words real samples from fake samples.
During training, we try to minimise the error of the generator, which amounts to maximising the error of the discriminator, while the discriminator itself is trained to minimise that same error. The discriminator has the most difficulty predicting the class when the two distributions are equal at every point: in this case, each point is equally likely to be “true” or “generated”, so the discriminator can’t do better than being right one time out of two on average.
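The equilibrium described above can be checked numerically on a toy case. The sketch below assumes discrete distributions over a handful of points and an equal mix of real and generated samples; it uses the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_gen(x)) derived in the original GAN paper by Goodfellow et al. The function names and the example distributions are illustrative.

```python
def optimal_discriminator(p_data, p_gen):
    """Probability that each point is real, for the best possible discriminator."""
    return [
        pd / (pd + pg) if pd + pg > 0 else 0.5
        for pd, pg in zip(p_data, p_gen)
    ]


def best_accuracy(p_data, p_gen):
    """Accuracy of the best discriminator when real and generated samples
    are presented equally often: it picks the likelier class at each point."""
    return sum(0.5 * max(pd, pg) for pd, pg in zip(p_data, p_gen))


p_data = [0.25, 0.5, 0.25]       # assumed true distribution over three points
p_gen_bad = [0.75, 0.125, 0.125] # generator that does not match the data yet
p_gen_good = [0.25, 0.5, 0.25]   # generator that matches the data exactly

print(optimal_discriminator(p_data, p_gen_bad), best_accuracy(p_data, p_gen_bad))
print(optimal_discriminator(p_data, p_gen_good), best_accuracy(p_data, p_gen_good))
```

With the mismatched generator the best discriminator is right more than half of the time; once the generated distribution equals the data distribution, its output is 0.5 everywhere and its accuracy falls to chance.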
More details can be found here:
- Generative Adversarial Networks: From theory to practice (Article)
- Generative Adversarial Networks: Practice (Notebook)
- Data generation
- Style transfer
- Text-to-Image Translation
- Super Resolution
- 3D Object Generation
- Photo Inpainting
- Generative Adversarial Nets
- Improved Techniques for Training GANs
- Variational Approaches for Auto-Encoding GANs
- Autoencoding Beyond Pixels Using a Learned Similarity Metric
- Pixel Recurrent Neural Networks