Diffusion Deepdive

An exploratory project aimed at understanding how diffusion models work by implementing several of them.

Approach:

1. Single-Image Generation (to verify that diffusion works at all)
2. Single-Class Generation (landscape data from Kaggle)
3. Multi-Class Conditional Generation (CIFAR-10)
4. Latent Diffusion (CIFAR-10)

Single-Image Generation

This verifies that diffusion works at all by making the entire dataset a single image. It is a sanity check before moving on to more complex tasks. All training was done on 1 x RTX 2080.
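A minimal sketch of what "the entire dataset is a single image" can look like in code. The class name `SingleImageDataset`, the `length` parameter, and the toy 2 x 2 image are hypothetical and not taken from this repository; the class only follows the usual `__len__`/`__getitem__` map-style dataset protocol so that a training loop sees many "samples" per epoch.

```python
class SingleImageDataset:
    """Map-style dataset that returns the same image for every index."""

    def __init__(self, image, length=1000):
        # `length` is an arbitrary epoch size (assumption): it only controls
        # how many times the single image is visited per epoch.
        self.image = image
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        return self.image


# Toy stand-in for a real image: a 2 x 2 grayscale grid scaled to [-1, 1].
toy_image = [[-1.0, 0.0], [0.5, 1.0]]
dataset = SingleImageDataset(toy_image, length=4)
print(len(dataset), dataset[0] == dataset[3])
```

If the model cannot reproduce this one image after training, the problem is in the diffusion code rather than in the data.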

Oski (32 x 32)

VLSB Santa Dino (64 x 64)

A rare creature, spotted only during the Christmas season in the VLSB building.

Single-Class Generation

Introduction: Extend single-image generation to single-class generation, testing on the landscape data first.

Landscape Data

1. Training Run #1

epochs = 500
lr = 1e-3
bs = 16 #gpu is 2080...
diffusion_timesteps = 1000

Training Time: ~9 hours
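To make `diffusion_timesteps = 1000` concrete, here is a rough sketch of a forward (noising) process with that many steps. It assumes the linear beta schedule from the original DDPM paper (`beta_start = 1e-4`, `beta_end = 0.02`); the schedule actually used in these runs is not stated here, and all function names below are hypothetical.

```python
import math
import random


def linear_beta_schedule(timesteps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed DDPM-style defaults)."""
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, noise)]
    return xt, noise


diffusion_timesteps = 1000
betas = linear_beta_schedule(diffusion_timesteps)
alpha_bars = cumulative_alphas(betas)

rng = random.Random(0)
x0 = [0.5, -0.25, 1.0, -1.0]  # toy "pixels" scaled to [-1, 1]
xt, noise = q_sample(x0, diffusion_timesteps - 1, alpha_bars, rng)
print(alpha_bars[0], alpha_bars[-1])
```

Under this schedule almost none of the original signal is left at the final step, which is why sampling can start from pure Gaussian noise.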

Summary Result

Detailed Sampling

Question:

One question came up while I was looking at the outputs:

Why does the model seem to generate the complicated parts of a scene (e.g. mountains, grass) first?
- For example, the model seemed to generate the mountains before the sky, and the grass before the sky.
- Are textured regions easier for the model to start from (i.e. is constructing prominent features first the easiest way to lower the loss)?

2. Training Run #2

epochs = 500
lr = 1e-4
bs = 16
diffusion_timesteps = 1000

Training Time: ~9 hours

Summary Result

Detailed Sampling

3. Training Run #3

epochs = 500 #should have increased epochs for this lr
lr = 1e-5 
bs = 16
diffusion_timesteps = 1000

Training Time: ~9 hours

Summary Result

Detailed Sampling

Italian Greyhound Data

Just an extension, since I like them.

1. Training Run #1

epochs = 3000
lr = 1e-4
bs = 12
diffusion_timesteps = 1000

Training Time: not noted here (see the tf log)

Summary Result

Detailed Sampling

Interesting Papers

Extracting Training Data from Diffusion Models
- Shows that diffusion models can reproduce training images when prompted appropriately.
- The security risks of diffusion models might be interesting to look into.
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
- Questions the necessity of random Gaussian noise for noising.
- Shows that other degradations, such as blurring or masking, can be used in place of Gaussian noise to train diffusion models.
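As a toy illustration of the Cold Diffusion idea, the sketch below replaces the random noising step with a deterministic degradation whose severity grows with the timestep `t`. A repeated 3-tap box blur on a 1D signal stands in for the image blurs discussed in the paper; the function names and the toy signal are hypothetical.

```python
def blur_once(signal):
    """One pass of a 3-tap box blur with wrap-around padding."""
    n = len(signal)
    return [
        (signal[(i - 1) % n] + signal[i] + signal[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def degrade(signal, t):
    """Deterministic forward process: blur the clean signal t times."""
    out = list(signal)
    for _ in range(t):
        out = blur_once(out)
    return out


def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


x0 = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0]
for t in (0, 1, 4, 16):
    # Detail (variance) shrinks as t grows, with no randomness involved.
    print(t, variance(degrade(x0, t)))
```

In this setup a network would be trained to restore `x0` from `degrade(x0, t)`, instead of predicting Gaussian noise.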
Understanding Diffusion Models: A Unified Perspective
- Overview of various diffusion models along with background on...
  - ELBO, VAE
  - Variational Diffusion Models
  - Score-based Generative Models
  - Guidance (classifier guidance, CG, and classifier-free guidance, CFG)
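Since classifier-free guidance is relevant to the planned multi-class conditional step, here is a small sketch of how the two noise predictions are typically combined at sampling time. The function name and the toy numbers are hypothetical, and the scale convention is an assumption: some papers write the same rule as `(1 + w) * eps_cond - w * eps_uncond`, which shifts the scale by one.

```python
def classifier_free_guidance(eps_cond, eps_uncond, guidance_scale):
    """Extrapolate from the unconditional prediction toward the conditional one.

    guidance_scale = 0 gives the unconditional prediction,
    guidance_scale = 1 gives the conditional prediction,
    guidance_scale > 1 pushes further in the conditional direction.
    """
    return [
        u + guidance_scale * (c - u)
        for c, u in zip(eps_cond, eps_uncond)
    ]


# Toy noise predictions for three "pixels" (made-up numbers).
eps_cond = [0.2, -0.4, 1.0]
eps_uncond = [0.1, -0.1, 0.5]
for w in (0.0, 1.0, 3.0):
    print(w, classifier_free_guidance(eps_cond, eps_uncond, w))
```

Both predictions usually come from one network that is trained with the class label randomly dropped, so no separate classifier is needed.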
