Replies: 2 comments
-
The mapping from the condition (text) and noise to an image happens in the diffusion model (a UNet or DiT). Also, the CLIP text encoder is pretrained on massive datasets (LAION, OpenAI's 400M image-text pairs, etc.).
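To make the first point concrete, here is a toy sketch of the mechanism by which the text condition enters the diffusion model: the UNet/DiT queries the CLIP text-token embeddings via cross-attention. This is a pure-Python illustration with invented dimensions and values, not the diffusers implementation.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def cross_attention(latent_q, text_kv):
    # latent_q: one query vector derived from the noisy image latent
    # text_kv: list of (key, value) pairs, one per text-token embedding
    d = len(latent_q)
    scores = [sum(q * k for q, k in zip(latent_q, key)) / math.sqrt(d)
              for key, _ in text_kv]
    weights = softmax(scores)
    out = [0.0] * len(text_kv[0][1])
    for w, (_, value) in zip(weights, text_kv):
        out = [o + w * v for o, v in zip(out, value)]
    return out

# One latent query and two text tokens; the query matches the first token.
q = [1.0, 0.0]
tokens = [([1.0, 0.0], [10.0, 0.0]),   # token aligned with the query
          ([0.0, 1.0], [0.0, 10.0])]   # unrelated token
out = cross_attention(q, tokens)
# The output is pulled toward the value of the matching text token,
# which is how the text condition steers the denoising prediction.
assert out[0] > out[1]
```

This is why the UNet/DiT, not the text encoder, is what learns the text-to-image mapping: the encoder only supplies the keys and values being attended over.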
-
I know it's a late answer, but just in case someone reads this later: we added the option to train the text encoders with SD3 LoRA training; you can see an example here.
-
I mean, if the concept is slightly different from what the model was trained on, it trains very badly without text encoder LoRA training.
(I mean only the CLIP text encoder; please don't add LoRA training on top of T5XXL.)
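As a toy sketch of what LoRA on the CLIP text encoder amounts to, and why it is cheap: each targeted linear layer gets a frozen base weight W plus a small trainable low-rank update B·A. All shapes and values below are invented for illustration; this is not the diffusers/PEFT implementation.

```python
def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha=1.0, rank=1):
    # y = W x + (alpha / rank) * B (A x)
    # W is frozen; only the small factors A and B are trained.
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]

# Frozen base weight: 4x4 identity, so the base output equals the input.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# Trainable low-rank factors (rank 1 here): A is 1x4, B is 4x1.
A = [[0.5, 0.5, 0.5, 0.5]]
B = [[1.0] for _ in range(4)]

x = [1.0, 2.0, 3.0, 4.0]
y = lora_forward(W, A, B, x)  # A x = [5.0], so each output gains 5.0

# With B zero-initialized (as at the start of training), the layer's
# output is exactly the frozen model's output, so training starts
# from the base model's behavior.
B0 = [[0.0] for _ in range(4)]
assert lora_forward(W, A, B0, x) == matvec(W, x)

# Parameter count for a hypothetical 768x768 CLIP layer: LoRA adds
# rank*(d_in + d_out) parameters vs d_in*d_out for a full fine-tune.
full_ft = 768 * 768          # 589824
lora_r4 = 4 * (768 + 768)    # 6144
assert lora_r4 < full_ft
```

Because the added parameter count is so small, adapting the CLIP text encoder this way is a cheap lever when the concept's text embedding is off, whereas T5XXL is far larger and better left frozen.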