
[Sana] Add Sana, including SanaPipeline, SanaPAGPipeline, LinearAttentionProcessor, Flow-based DPM-solver and so on. #9982

Open · wants to merge 85 commits into main

Conversation

lawrence-cj (Contributor)

What does this PR do?

This PR adds the official Sana (SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer) to the diffusers library. Sana is the first to make text-to-image generation work on a 32x compressed latent space, powered by DC-AE (https://arxiv.org/abs/2410.10733v1), without performance degradation. Sana also incorporates several popular efficiency techniques, such as a DiT with a linear attention processor, and uses a decoder-only LLM (Gemma-2B-IT) as the text encoder for low GPU memory requirements and fast inference.
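For intuition on why the 32x compression matters, here is some back-of-the-envelope token arithmetic (an illustrative sketch; the `latent_tokens` helper and the patch sizes are hypothetical, not Sana's actual code):

```python
# Token-count arithmetic for a 1024x1024 image (illustrative only):
# a conventional 8x VAE with 2x2 patchification vs. a 32x DC-AE.

def latent_tokens(image_size: int, compression: int, patch_size: int = 1) -> int:
    """Number of transformer tokens after VAE compression and patchification."""
    latent_size = image_size // compression
    return (latent_size // patch_size) ** 2

vae_8x = latent_tokens(1024, 8, patch_size=2)    # 128x128 latent, 2x2 patches -> 64*64 = 4096 tokens
dc_ae_32x = latent_tokens(1024, 32, patch_size=1)  # 32x32 latent -> 1024 tokens

print(vae_8x, dc_ae_32x)
```

Fewer tokens means quadratically less attention work per layer, which is where much of the claimed speedup comes from.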

Paper: https://arxiv.org/abs/2410.10629
Original code repo: https://github.com/NVlabs/Sana
Project: https://nvlabs.github.io/Sana

Core contributor of DC-AE:
work with @[email protected]

We want to collaborate on this PR together with friends from HF. Feel free to contact me here. Cc: @sayakpaul, @yiyixuxu

Images generated by SanaPAGPipeline with FlowDPMSolverMultistepScheduler:


lawrence-cj and others added 17 commits November 27, 2024 00:44
* skip nan lora tests on PyTorch 2.5.1 CPU.

* cog

* use xfail

* correct xfail

* add condition

* tests
* enable on xpu

* add 1 more

* add one more

* enable more

* add 1 more

* add more

* enable 1

* enable more cases

* enable

* enable

* update comment

* one more

* enable 1

* add more cases

* enable xpu

* add one more case

* add more cases

* add 1

* add more

* add more cases

* add case

* enable

* add more

* add more

* add more

* enable more

* add more

* update code

* update test marker

* add skip back

* update comment

* remove single files

* remove

* style

* add

* revert

* reformat

* update decorator

* update

* update

* update

* Update tests/pipelines/deepfloyd_if/test_if.py

Co-authored-by: Dhruv Nair <[email protected]>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: Dhruv Nair <[email protected]>

* Update tests/pipelines/animatediff/test_animatediff_controlnet.py

Co-authored-by: Dhruv Nair <[email protected]>

* Update tests/pipelines/animatediff/test_animatediff.py

Co-authored-by: Dhruv Nair <[email protected]>

* Update tests/pipelines/animatediff/test_animatediff_controlnet.py

Co-authored-by: Dhruv Nair <[email protected]>

* update float16

* no unittest.skip

* update

* apply style check

* reapply format

---------

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Dhruv Nair <[email protected]>
* update

---------

Co-authored-by: yiyixuxu <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
* smol change to fix checkpoint saving & resuming (as done in train_dreambooth_sd3.py)

* style

* modify comment to explain reasoning behind hidden size check
lawrence-cj (Contributor, Author) commented Nov 27, 2024

> so you're licensing this code to fit into the Diffusers project? because the original Sana codebase is non-commercial. why is that NC but this is being opened as Apache 2.0?

The license of Sana's codebase has been changed to Apache 2.0, @bghira. Refer to: https://github.com/NVlabs/Sana?tab=Apache-2.0-1-ov-file

@@ -407,6 +409,11 @@ def set_timesteps(
sigmas = np.flip(sigmas).copy()
sigmas = self._convert_to_beta(in_sigmas=sigmas, num_inference_steps=num_inference_steps)
timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas])
elif self.config.use_flow_sigmas:
Collaborator:

@lawrence-cj can we use karras_sigmas/exponential_sigmas/beta_sigmas with flow-matching? (i.e. use_beta_sigmas=True and prediction_type="flow_prediction")

Contributor (Author):

Suggested change:

```diff
-elif self.config.use_flow_sigmas:
+elif self.config.use_beta_sigmas:
+    sigmas = np.flip(sigmas).copy()
+    sigmas = self._convert_to_beta(in_sigmas=sigmas, num_inference_steps=num_inference_steps)
+    timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas])
+    if self.config.use_flow_sigmas:
+        alphas = np.linspace(1, 1 / self.config.num_train_timesteps, num_inference_steps + 1)
+        sigmas = 1.0 - alphas
+        sigmas = np.flip(self.config.flow_shift * sigmas / (1 + (self.config.flow_shift - 1) * sigmas))[:-1]
+        timesteps = (sigmas * self.config.num_train_timesteps).copy()
```

Do you mean logic like this, but with some changes to the code?
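For reference, the flow-sigma computation in the suggested change can be exercised standalone (a sketch with the `self.config.*` values pulled out as plain arguments; the default values are illustrative, not necessarily the scheduler's):

```python
import numpy as np

def flow_sigmas(num_inference_steps: int,
                num_train_timesteps: int = 1000,
                flow_shift: float = 3.0):
    """Standalone version of the flow-sigma schedule from the snippet above."""
    alphas = np.linspace(1, 1 / num_train_timesteps, num_inference_steps + 1)
    sigmas = 1.0 - alphas
    # Apply the timestep shift, then flip so sigmas run from high noise to low.
    sigmas = np.flip(flow_shift * sigmas / (1 + (flow_shift - 1) * sigmas))[:-1]
    timesteps = sigmas * num_train_timesteps
    return sigmas, timesteps

sigmas, timesteps = flow_sigmas(20)
print(sigmas[0], sigmas[-1])  # strictly decreasing, all in (0, 1)
```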

Collaborator:

umm not sure what you mean, here in your suggested code change you have this:

```python
if self.config.use_beta_sigmas:
    if self.config.use_flow_sigmas:
        ...
```

I don't think we can configure `use_beta_sigmas` and `use_flow_sigmas` to be True at the same time. However, we should be able to configure `use_beta_sigmas=True` and `prediction_type="flow_prediction"` at the same time, no? Basically, you would still be doing flow matching, but with a "noise schedule" that's not uniform.

Collaborator:

I just wonder what happens if the user does that with the current implementation. Would it work?

Contributor (Author):

> you will still be doing flow match but use a "noise schedule" that's not uniform

Then it will not work, I think; the noise schedule has to be uniform, as it's defined in SD3, Flux and Sana.
If I understand correctly, the code would work, like:

```python
if self.config.use_beta_sigmas:
    if prediction_type == "flow_prediction":
        ...
```

But I'm not sure whether the sigmas here can be uniform:

```python
sigmas = np.array(
    [
        sigma_min + (ppf * (sigma_max - sigma_min))
        for ppf in [
            scipy.stats.beta.ppf(timestep, alpha, beta)
            for timestep in 1 - np.linspace(0, 1, num_inference_steps)
        ]
    ]
)
return sigmas
```

If the sigmas can't be uniform, I don't think it's reasonable to allow `prediction_type="flow_prediction"` under `use_beta_sigmas=True`.
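To check concretely whether that snippet yields uniform spacing, here is a self-contained version one can run (a sketch; `sigma_min`, `sigma_max`, and the `alpha=0.6, beta=0.6` defaults are illustrative, not necessarily the scheduler's actual values):

```python
import numpy as np
import scipy.stats

def beta_sigmas(num_inference_steps: int,
                sigma_min: float = 0.01,
                sigma_max: float = 1.0,
                alpha: float = 0.6,
                beta: float = 0.6) -> np.ndarray:
    """Self-contained version of the beta-sigma schedule quoted above."""
    return np.array(
        [
            sigma_min + ppf * (sigma_max - sigma_min)
            for ppf in [
                scipy.stats.beta.ppf(t, alpha, beta)
                for t in 1 - np.linspace(0, 1, num_inference_steps)
            ]
        ]
    )

sigmas = beta_sigmas(10)
steps = np.diff(sigmas)
# The step sizes differ substantially, so the schedule is indeed not uniform.
print(steps.min(), steps.max())
```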

Collaborator:

ohh I was under the impression that we can use other sigma distributions for flow matching, see #10001 (comment)

but we don't have to worry about it for this PR. If you have time and are interested in investigating this, it would be great! :) If not, we can just make sure users are only able to use `use_flow_sigmas=True` when `prediction_type="flow_prediction"`, i.e. throw an error otherwise
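The guard suggested here could look like the following (a sketch; the function name is hypothetical, and in practice the check would live in the scheduler's `__init__` against the config):

```python
# Hypothetical validation: reject use_flow_sigmas without flow prediction.
def validate_flow_config(use_flow_sigmas: bool, prediction_type: str) -> None:
    if use_flow_sigmas and prediction_type != "flow_prediction":
        raise ValueError(
            "`use_flow_sigmas=True` requires `prediction_type='flow_prediction'`, "
            f"but got prediction_type={prediction_type!r}."
        )

validate_flow_config(False, "epsilon")         # fine: flow sigmas not requested
validate_flow_config(True, "flow_prediction")  # fine: consistent configuration
try:
    validate_flow_config(True, "epsilon")      # inconsistent: raises ValueError
except ValueError as e:
    print("rejected:", e)
```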

Collaborator:

given the code change is so minimal, I think we can just keep the change in the DPM scheduler for now (we can remove that new scheduler file)

this creates some inconsistency across the library (for euler and heun we have a separate flow match scheduler); but since the change is tiny and we are going through some scheduler refactoring soon, I think it is ok!

Contributor (Author):

Yeah, my only concern is that if we maintain the flow prediction in the original DPM scheduler, there is a lot of unrelated code that exists only for the original dpm-solver. But if you can refactor and integrate flow into the original file nicely, I'm very OK with it! :)

lawrence-cj (Contributor, Author), Nov 27, 2024:

> ohh I was under the impression that we can use other sigma distributions for flow matching, see #10001 (comment)
>
> but we don't have to worry about it for this PR. If you have time and are interested in investigating this, it would be great! :) If not, we can just make sure users are only able to use `use_flow_sigmas=True` when `prediction_type="flow_prediction"`, i.e. throw an error otherwise

Interesting, and it makes sense. I ran an experiment before: if I train the model with `timestep_shift=3` and use `timestep_shift=4` at inference, it still works well. This may explain why the model keeps working when the sigmas change a little, especially for large models like FLUX. I'll figure it out in a later update to this PR.
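The observation about mismatched shifts can be quantified with a quick numeric check (a sketch; `shift` here plays the role of the `timestep_shift`/`flow_shift` discussed above, and the step count is arbitrary):

```python
import numpy as np

def shifted_sigmas(shift: float, num_steps: int = 50) -> np.ndarray:
    """Uniform flow sigmas warped by the timestep shift used in SD3/Flux/Sana."""
    s = np.linspace(1.0, 0.0, num_steps)
    return shift * s / (1 + (shift - 1) * s)

# Largest pointwise gap between the shift=3 and shift=4 schedules.
diff = np.abs(shifted_sigmas(3.0) - shifted_sigmas(4.0)).max()
print(diff)  # ~0.07: the two schedules stay close everywhere
```

The two schedules never differ by more than about 0.07 in sigma, which may help explain why sampling with a slightly different shift than the one used in training still works.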

lawrence-cj (Contributor, Author) commented Nov 30, 2024

Hi, we have prepared two model repos for you to test the correctness of both SanaPipeline and SanaPAGPipeline:

https://huggingface.co/Efficient-Large-Model/Sana_pag_1600M_1024px_diffusers
https://huggingface.co/Efficient-Large-Model/Sana_1600M_1024px_diffusers
Gentle ping @bghira @yiyixuxu @stevhliu @a-r-r-o-w @sayakpaul
