An exploratory project to understand how diffusion models work by implementing several of them:
- Single-Image Generation (to verify that diffusion does work)
- Single-Class Generation (Landscape data from Kaggle)
- Multi-Class Conditional Generation (CIFAR-10)
- Latent Diffusion (CIFAR-10)
This verifies that diffusion works at all by making the entire dataset a single image. It is a sanity check before moving on to more complex tasks. All training was done on a single RTX 2080.
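A minimal sketch of what "the entire dataset is a single image" could look like in code. The class name `SingleImageDataset` and the `virtual_length` parameter are hypothetical and not taken from this repo; the class only follows the `__len__`/`__getitem__` protocol used by map-style datasets, so the actual training code may do this differently.

```python
class SingleImageDataset:
    """Dataset that returns the same image for every index (hypothetical helper)."""

    def __init__(self, image, virtual_length=1024):
        # `image` can be any array-like; a nested list is used in the example below.
        self.image = image
        # Pretend the dataset has many items so that one epoch yields many batches.
        self.virtual_length = virtual_length

    def __len__(self):
        return self.virtual_length

    def __getitem__(self, index):
        if not 0 <= index < self.virtual_length:
            raise IndexError(index)
        return self.image


# Example: a 4x4 all-zero "image" repeated 16 times.
dataset = SingleImageDataset(image=[[0.0] * 4 for _ in range(4)], virtual_length=16)
print(len(dataset), dataset[0] is dataset[15])
```

If the model cannot reproduce this one image, the bug is most likely in the training or sampling code rather than in the data.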
A rare creature, spotted only during the Christmas season in the VLSB building.
Introduction: Extend single-image generation to single-class generation, testing with the landscape data first.
epochs = 500
lr = 1e-3
bs = 16  # GPU is an RTX 2080
diffusion_timesteps = 1000
Training Time: ~9 hours
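For context on what `diffusion_timesteps = 1000` controls, here is a rough sketch of a forward (noising) process. It assumes the linear beta schedule from the original DDPM paper (`beta_start = 1e-4`, `beta_end = 0.02`); the schedule actually used in these runs is not stated above, and all function names are illustrative.

```python
import math
import random


def linear_beta_schedule(timesteps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed DDPM defaults, not from this repo)."""
    if timesteps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng=random):
    """Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise.

    x0 is a flat list of pixel values and t is a 0-based timestep index.
    Returns the noised values and the noise that was added.
    """
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return xt, noise


diffusion_timesteps = 1000
betas = linear_beta_schedule(diffusion_timesteps)
alpha_bars = cumulative_alphas(betas)
print(alpha_bars[0], alpha_bars[-1])  # close to 1 at the start, close to 0 at the end
```

With this schedule the signal is almost untouched at the first step and almost entirely replaced by noise at the last one.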
Summary Result
Detailed Sampling
One question came up when I was looking at the outputs:
- Why does the model seem to generate complicated parts of a scene (e.g. mountains, grass) first?
- For example, the model seemed to generate the mountains before the sky, or the grass before the sky.
- Is something with texture easier for a model to start generating from (i.e. is constructing prominent features first the easiest way to lower the loss)?
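One way to make this observation less eyeball-based would be to measure, per region, how early the intermediate samples get close to the final image. The sketch below is only an idea, not something that was run for these results; `region_error_curve`, `settle_step`, the hand-made region masks and the `tolerance` value are all hypothetical.

```python
def region_error_curve(frames, final, region):
    """Mean absolute difference to the final image inside `region`, one value per frame.

    frames: list of 2D lists (intermediate samples, grayscale for simplicity)
    final:  2D list (the finished sample)
    region: list of (row, col) pixel coordinates, e.g. a hand-drawn sky mask
    """
    region = list(region)
    curve = []
    for frame in frames:
        total = sum(abs(frame[r][c] - final[r][c]) for r, c in region)
        curve.append(total / len(region))
    return curve


def settle_step(curve, tolerance=0.05):
    """First frame index from which the curve stays at or below `tolerance`."""
    for i in range(len(curve)):
        if all(v <= tolerance for v in curve[i:]):
            return i
    return None


# Toy 2x2 example: top row plays the "sky", bottom row the "mountains".
final = [[0.0, 0.0], [0.0, 0.0]]
frames = [[[0.9, 0.8], [0.5, 0.4]], [[0.6, 0.5], [0.02, 0.01]], final]
sky = [(0, 0), (0, 1)]
mountains = [(1, 0), (1, 1)]
sky_step = settle_step(region_error_curve(frames, final, sky))
mountain_step = settle_step(region_error_curve(frames, final, mountains))
print(sky_step, mountain_step)
```

Comparing noisy intermediate samples to the final image mixes noise level with structure, so running the same measurement on the model's predicted clean image at each step would probably be more informative.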
epochs = 500
lr = 1e-4
bs = 16
diffusion_timesteps = 1000
Training Time: ~9 hours
Summary Result
Detailed Sampling
epochs = 500  # should have increased epochs for this lr
lr = 1e-5
bs = 16
diffusion_timesteps = 1000
Training Time: ~9 hours
Summary Result
Detailed Sampling
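The three landscape runs above differ only in learning rate. A small sketch of how that sweep could be written down as config objects; `TrainConfig` is a hypothetical name and the real training script may store its settings differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters shared by the landscape runs (field names mirror the settings above)."""
    epochs: int = 500
    lr: float = 1e-3
    bs: int = 16
    diffusion_timesteps: int = 1000


base = TrainConfig()
# Same settings for every run, only the learning rate changes.
landscape_runs = [replace(base, lr=lr) for lr in (1e-3, 1e-4, 1e-5)]

for cfg in landscape_runs:
    print(cfg)
```

Keeping the shared values in one place makes it easier to see that a change in the results comes from the learning rate and not from an accidental edit elsewhere.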
Just an extension, since I like them.
epochs = 3000
lr = 1e-4
bs = 12
diffusion_timesteps = 1000
Training Time: not noted here (see the tf log)
Summary Result
Detailed Sampling
- Extracting Training Data from Diffusion Models
  - Shows that diffusion models can emit memorized training data when prompted well.
  - Security risks of diffusion models might be interesting to look at.
- Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
  - Questions the necessity of random Gaussian noise for noising.
  - Shows that other degradations, like blurring or masking, can be used in place of Gaussian noise to train diffusion models.
- Understanding Diffusion Models: A Unified Perspective
  - Overview of various diffusion models along with background on...
    - ELBO, VAE
    - Variational Diffusion Models
    - Score-based Generative Models
    - Guidance (classifier guidance, CG, and classifier-free guidance, CFG)
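As a quick reminder of what classifier-free guidance does at sampling time, here is a minimal sketch of the combination step. It uses the convention where a weight of 0 gives the unconditional prediction and 1 gives the conditional one; some write-ups shift the weight by one. `cfg_combine` is an illustrative name, and the predictions are plain lists rather than real model outputs.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_weight):
    """Return eps_uncond + w * (eps_cond - eps_uncond), elementwise over flat lists."""
    return [u + guidance_weight * (c - u) for u, c in zip(eps_uncond, eps_cond)]


eps_uncond = [0.0, 1.0]  # stand-in for the noise prediction without a class label
eps_cond = [1.0, 3.0]    # stand-in for the noise prediction with a class label

for w in (0.0, 1.0, 2.0):
    print(w, cfg_combine(eps_uncond, eps_cond, w))
```

Weights above 1 push the prediction further along the direction from unconditional to conditional, which is the part that strengthens the class signal.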
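To illustrate the Cold Diffusion idea from the list above, here is a toy sketch of a deterministic degradation that uses repeated blurring instead of Gaussian noise. It works on a 1D list for brevity; the box blur, the clamped edges and the function names are assumptions made for this example and are not the paper's exact operators.

```python
def box_blur_1d(signal):
    """One blur pass: average each value with its two neighbours (edges clamped)."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + signal[i] + right) / 3.0)
    return out


def degrade(x0, t):
    """Deterministic degradation: apply t blur passes, so degrade(x0, 0) equals x0."""
    x = list(x0)
    for _ in range(t):
        x = box_blur_1d(x)
    return x


x0 = [0.0, 0.0, 1.0, 0.0, 0.0]
for t in (0, 1, 5):
    xt = degrade(x0, t)
    print(t, max(xt) - min(xt))  # contrast shrinks as t grows
```

In this setup a network would be trained to restore `x0` from `degrade(x0, t)`, taking the place of the usual noise-prediction objective.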