[Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`, `LinearAttentionProcessor`, flow-based DPM-Solver, and so on.
#9982
base: main
Conversation
# Conflicts: # src/diffusers/models/normalization.py
2. make style and make quality;
Co-authored-by: Steven Liu <[email protected]>
…ingface#9981) * fix * update expected slice
* skip nan lora tests on PyTorch 2.5.1 CPU. * cog * use xfail * correct xfail * add condition * tests
* enable on xpu * add 1 more * add one more * enable more * add 1 more * add more * enable 1 * enable more cases * enable * enable * update comment * one more * enable 1 * add more cases * enable xpu * add one more case * add more cases * add 1 * add more * add more cases * add case * enable * add more * add more * add more * enable more * add more * update code * update test marker * add skip back * update comment * remove single files * remove * style * add * revert * reformat * update decorator * update * update * update * Update tests/pipelines/deepfloyd_if/test_if.py Co-authored-by: Dhruv Nair <[email protected]> * Update src/diffusers/utils/testing_utils.py Co-authored-by: Dhruv Nair <[email protected]> * Update tests/pipelines/animatediff/test_animatediff_controlnet.py Co-authored-by: Dhruv Nair <[email protected]> * Update tests/pipelines/animatediff/test_animatediff.py Co-authored-by: Dhruv Nair <[email protected]> * Update tests/pipelines/animatediff/test_animatediff_controlnet.py Co-authored-by: Dhruv Nair <[email protected]> * update float16 * no unittest.skip * update * apply style check * reapply format --------- Co-authored-by: Sayak Paul <[email protected]> Co-authored-by: Dhruv Nair <[email protected]>
Co-authored-by: Dhruv Nair <[email protected]>
* update --------- Co-authored-by: yiyixuxu <[email protected]> Co-authored-by: Sayak Paul <[email protected]>
* smol change to fix checkpoint saving & resuming (as done in train_dreambooth_sd3.py) * style * modify comment to explain reasoning behind hidden size check
add: missing pipelines from the spec.
# Conflicts: # src/diffusers/__init__.py
The license of Sana's code base has been changed to Apache 2.0 @bghira. Refer to: https://github.com/NVlabs/Sana?tab=Apache-2.0-1-ov-file
```diff
@@ -407,6 +409,11 @@ def set_timesteps(
             sigmas = np.flip(sigmas).copy()
             sigmas = self._convert_to_beta(in_sigmas=sigmas, num_inference_steps=num_inference_steps)
             timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas])
+        elif self.config.use_flow_sigmas:
```
@lawrence-cj can we use `karras_sigmas`/`exponential_sigmas`/`beta_sigmas` with flow matching? (i.e. `use_beta_sigmas=True` and `prediction_type="flow_prediction"`)
```diff
-            elif self.config.use_flow_sigmas:
+            elif self.config.use_beta_sigmas:
+                sigmas = np.flip(sigmas).copy()
+                sigmas = self._convert_to_beta(in_sigmas=sigmas, num_inference_steps=num_inference_steps)
+                timesteps = np.array([self._sigma_to_t(sigma, log_sigmas) for sigma in sigmas])
+            if self.config.use_flow_sigmas:
+                alphas = np.linspace(1, 1 / self.config.num_train_timesteps, num_inference_steps + 1)
+                sigmas = 1.0 - alphas
+                sigmas = np.flip(self.config.flow_shift * sigmas / (1 + (self.config.flow_shift - 1) * sigmas))[:-1]
+                timesteps = (sigmas * self.config.num_train_timesteps).copy()
```
Do you mean logic like this, but with some changes to the code?
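For reference, the flow branch of that suggestion can be run standalone. A minimal numpy sketch, assuming the defaults `num_train_timesteps=1000` and `flow_shift=3.0` (these are illustrative values, not read from any scheduler config):

```python
import numpy as np

def flow_sigmas(num_inference_steps: int, num_train_timesteps: int = 1000, flow_shift: float = 3.0):
    """Sketch of the flow-matching sigma schedule from the suggested change:
    uniform in alpha, then warped by the timestep shift."""
    alphas = np.linspace(1, 1 / num_train_timesteps, num_inference_steps + 1)
    sigmas = 1.0 - alphas
    # shift: sigma' = s*sigma / (1 + (s-1)*sigma); flip to descending and drop the trailing 0
    sigmas = np.flip(flow_shift * sigmas / (1 + (flow_shift - 1) * sigmas))[:-1]
    timesteps = sigmas * num_train_timesteps
    return sigmas, timesteps

sigmas, timesteps = flow_sigmas(20)
```

The resulting sigmas descend strictly from just below 1 toward 0, one per inference step.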
umm, not sure what you mean; here in your suggested code change you have this:

```python
if self.config.use_beta_sigmas:
    if self.config.use_flow_sigmas:
        ...
```

I don't think we can configure `use_beta_sigmas` and `use_flow_sigmas` to be `True` at the same time. However, we should be able to configure `use_beta_sigmas=True` and `prediction_type="flow_prediction"` at the same time, no? Basically, you would still be doing flow matching but using a "noise schedule" that's not uniform.
I just wonder what happens if the user does that with the current implementation. Would it work?
> you will still be doing flow match but use a "noise schedule" that's not uniform

Then it will not work, I think; the noise schedule has to be uniform, as it's defined in SD3, Flux, and Sana.
If I understand correctly, code like this will run:

```python
if self.config.use_beta_sigmas:
    if prediction_type == "flow_prediction":
        ...
```

But I'm not sure whether the sigmas here can be uniform:
diffusers/src/diffusers/schedulers/scheduling_dpmsolver_multistep.py, lines 596 to 605 in 29e93b7:

```python
sigmas = np.array(
    [
        sigma_min + (ppf * (sigma_max - sigma_min))
        for ppf in [
            scipy.stats.beta.ppf(timestep, alpha, beta)
            for timestep in 1 - np.linspace(0, 1, num_inference_steps)
        ]
    ]
)
return sigmas
```
If the sigmas can't be uniform, I think it's not reasonable to place `prediction_type="flow_prediction"` under `use_beta_sigmas=True`.
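Indeed, sampling the Beta percent-point function on a uniform grid gives clearly non-uniform sigma spacing. A standalone sketch of the quoted snippet, assuming illustrative values `alpha=0.6`, `beta=0.6`, `sigma_min=0.002`, `sigma_max=0.999` (not read from the scheduler):

```python
import numpy as np
import scipy.stats

def beta_sigmas(num_inference_steps, sigma_min=0.002, sigma_max=0.999, alpha=0.6, beta=0.6):
    # mirrors the library snippet quoted above: uniform timesteps are warped
    # through the Beta PPF, so the resulting sigmas are not uniformly spaced
    ppfs = [
        scipy.stats.beta.ppf(timestep, alpha, beta)
        for timestep in 1 - np.linspace(0, 1, num_inference_steps)
    ]
    return np.array([sigma_min + (p * (sigma_max - sigma_min)) for p in ppfs])

sigmas = beta_sigmas(10)
gaps = np.abs(np.diff(sigmas))
# with a U-shaped Beta(0.6, 0.6), steps cluster near both ends of the sigma
# range, so mid-range gaps are much larger than the edge gaps
```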
ohh, I was under the impression that we can use other sigma distributions for flow matching, see #10001 (comment).
But we don't have to worry about it for this PR. If you have time and are interested in investigating this, it would be great! :) If not, we can just make sure users are only able to use `use_flow_sigmas=True` when `prediction_type="flow_prediction"`, i.e. throw an error if that's not the case.
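The proposed guard could be as small as this (a hypothetical helper, just sketching the error-throwing check suggested above; it is not code from the PR):

```python
def validate_flow_config(use_flow_sigmas: bool, prediction_type: str) -> None:
    # hypothetical check: flow sigmas are only valid together with flow prediction
    if use_flow_sigmas and prediction_type != "flow_prediction":
        raise ValueError(
            "`use_flow_sigmas=True` requires `prediction_type='flow_prediction'`."
        )

validate_flow_config(True, "flow_prediction")   # ok
validate_flow_config(False, "epsilon")          # ok
# validate_flow_config(True, "epsilon")         # would raise ValueError
```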
given the code change is so minimal, I think we can just keep the change in the DPM scheduler for now (we can remove that new scheduler file)
this creates some inconsistency across the library (for Euler and Heun we have a separate flow match scheduler); but since the change is tiny and we are going through some scheduler refactoring soon, I think it is ok!
Yeah, my only concern is that if we maintain flow prediction in the original DPM scheduler, there is a lot of unrelated code there that's only for the original DPM-Solver. But if you can refactor and integrate flow matching into the original file nicely, I'm very OK with it! :)
> ohh, I was under the impression that we can use other sigma distributions for flow matching, see #10001 (comment). But we don't have to worry about it for this PR. If you have time and are interested in investigating this, it would be great! :) If not, we can just make sure users are only able to use `use_flow_sigmas=True` when `prediction_type="flow_prediction"`, i.e. throw an error if that's not the case.

Interesting, and it makes sense. I ran an experiment before: if I train the model with `timestep_shift=3` and use `timestep_shift=4` for inference, it also works well. This may explain why the model can still work when the sigmas change a little bit, especially for large models like FLUX. I'll figure it out in a later update to this PR.
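A quick sanity check of how small that change actually is: comparing the shifted sigma curve for `flow_shift=3` vs `flow_shift=4` (an illustrative sketch of the shift formula from the suggestion above, not the training code):

```python
import numpy as np

def shifted(sigmas: np.ndarray, shift: float) -> np.ndarray:
    # the timestep-shift warp: sigma' = s*sigma / (1 + (s-1)*sigma)
    return shift * sigmas / (1 + (shift - 1) * sigmas)

sigmas = np.linspace(0, 1, 1000)
max_gap = np.max(np.abs(shifted(sigmas, 4.0) - shifted(sigmas, 3.0)))
# the two schedules never differ by more than ~0.07 in sigma, which is
# consistent with the model still working across nearby shift values
```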
# Conflicts: # src/diffusers/models/autoencoders/dc_ae.py
Hi, we prepared two model repos for you to test the correctness of both: https://huggingface.co/Efficient-Large-Model/Sana_pag_1600M_1024px_diffusers
What does this PR do?
This PR adds the official Sana (SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers) to the diffusers library. Sana is the first to make text-to-image generation available on a 32x-compressed latent space, powered by DC-AE (https://arxiv.org/abs/2410.10733v1), without performance degradation. Sana also contains several popular efficiency-related techniques, such as a DiT with a linear attention processor, and it uses a decoder-only LLM (Gemma-2B-IT) for low GPU requirements and fast speed.
Paper: https://arxiv.org/abs/2410.10629
Original code repo: https://github.com/NVlabs/Sana
Project: https://nvlabs.github.io/Sana
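The linear attention mentioned above can be sketched in a few lines. This is a minimal numpy illustration of the idea (a ReLU feature map in place of softmax), not the diffusers `LinearAttentionProcessor` implementation:

```python
import numpy as np

def relu_linear_attention(q, k, v, eps=1e-6):
    # ReLU feature map replaces softmax; computing k.T @ v first makes the
    # cost linear in sequence length N instead of quadratic
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    kv = k.T @ v                      # (d, d_v) -- independent of N
    z = k.sum(axis=0)                 # (d,) normalizer
    return (q @ kv) / ((q @ z)[:, None] + eps)

rng = np.random.default_rng(0)
q = rng.random((8, 4))
k = rng.random((8, 4))
v = np.ones((8, 2))
out = relu_linear_attention(q, k, v)
# each output row is a convex combination of value rows, so with v == 1
# the output is (almost exactly) 1 everywhere
```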
Core contributor of DC-AE:
work with @[email protected]
Core library:
We want to collaborate on this PR together with friends from HF. Feel free to contact me here. Cc: @sayakpaul, @yiyixuxu
Images are generated by `SanaPAGPipeline` with `FlowDPMSolverMultistepScheduler`.