Question about SDXL conversion to diffusers in convert_from_ckpt.py #8238
Replies: 1 comment · 5 replies
How are you comparing the outputs of text_encoder_2? Also cc @sayakpaul @DN6
I converted a Civitai model to diffusers using the following commands.
I checked the output of text_encoder_2 using the following code.
I found that text_projection.weight is transposed in https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py#L969
I would also like to ask about another difference between kohya and diffusers. The code for calculating pool_prompt_embeds in kohya and in diffusers is below. The difference starts from this.
Thanks for the detailed explanation. As a first clarification, the sd-scripts repo is focused on experimental features and not so much on maintaining compatibility or sticking to the original specs of the models. That repo is newer than diffusers, so all the changes you mention were made on top of diffusers and not the other way around. The best person to explain why those changes exist, or what they are for, is probably the author of that repo rather than the diffusers team. I do see some comments explaining them, though.

As for the first question, it probably doesn't matter whether the weight is transposed as long as the inference code uses it accordingly; it's just a matter of how it's used. Sadly I don't have the time to read the whole kohya repo to search for it, but I suggest you also look at the inference code and not just the encoding parts.

As for the second question, the answer is almost the same, but also note that the code you're looking at is a simplified version.

If you still want explanations about this, I pinged the people who may know, but sadly this takes time and not everyone has time for it. What I can say is that diffusers always stays true to the spec of the model and the papers, most of the time in collaboration with the authors and/or with reviews from them.
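To make the point about the transposed text_projection.weight concrete, here is a minimal sketch. It assumes (this is an assumption, not something verified against either repo here) that the original checkpoint stores the projection as a plain matrix applied as `pooled @ W`, while the converted model holds it in a `torch.nn.Linear`, which computes `x @ weight.T`. The sizes and variable names are toy values made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy square sizes for illustration only (not the real encoder dimensions).
hidden_size, projection_dim = 4, 4

# Assumed checkpoint-style parameter: applied as `pooled @ w_ckpt`,
# so its shape is (hidden_size, projection_dim).
w_ckpt = torch.randn(hidden_size, projection_dim)

# nn.Linear stores weight as (out_features, in_features) and computes x @ weight.T,
# so the checkpoint matrix has to be transposed when it is loaded.
proj = nn.Linear(hidden_size, projection_dim, bias=False)
with torch.no_grad():
    proj.weight.copy_(w_ckpt.T)

pooled = torch.randn(2, hidden_size)  # stand-in for pooled hidden states

out_matmul = pooled @ w_ckpt   # how the checkpoint-style matrix is applied
out_linear = proj(pooled)      # how the nn.Linear layer applies it

same = torch.allclose(out_matmul, out_linear, atol=1e-6)

# Loading the matrix without the transpose gives a different projection.
with torch.no_grad():
    proj.weight.copy_(w_ckpt)
same_without_transpose = torch.allclose(out_matmul, proj(pooled), atol=1e-6)

print(same, same_without_transpose)
```

Under these assumptions the two ways of applying the weight agree only when the matrix is transposed on load. When the projection matrix happens to be square, a missing transpose raises no shape error, so a mismatch in outputs is easier to trace by checking how each code base applies the weight than by comparing the stored tensors.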
As you mentioned, this difference may just be an engineering detail. I was wondering whether it is a known design choice that I could get some insight into. I think it's up to me to work out what these details mean. Thanks for your response to such a long question.
Hey @SeungHwa92, I noticed the same thing and was wondering about it too. Did you manage to figure it out?
Hi, I'm trying to move my kohya code (for training an SDXL LoRA) to diffusers.
While doing this, I found that the output of text_encoder_2 is different.
I also found that when a kohya model is converted to a diffusers model, some values are transposed.
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py#L969
Could you please tell me the reason for this?