Experimental style & composition transfer for SD15 #433
-
This might interest you, @xiaohu2015 and @haofanwang.
-
A long time ago, when MBW model merging was in common use, there were many studies on the SD blocks. https://civitai.com/articles/76/bdsqlsz-lora-training-advanced-tutorial1lora-block-training Style is concentrated almost entirely in out5~out11, while layout is related to how much of the canvas resolution the object occupies. The closer a block is to the mid block, the more it emphasizes the overall layout; the further it is from the mid block, the more it attends to details.
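The block positions mentioned above can be illustrated with a small sketch. It assumes the usual MBW-style labelling of the SD1.5 UNet (12 input blocks, one mid block, 12 output blocks); the names `BLOCKS` and `describe` are hypothetical and exist only for this example.

```python
# MBW-style block labels for the SD1.5 UNet: in0..in11, mid, out0..out11.
# (Assumed labelling for illustration; not taken from any specific tool.)
BLOCKS = [f"in{i}" for i in range(12)] + ["mid"] + [f"out{i}" for i in range(12)]
MID_INDEX = BLOCKS.index("mid")

# Range reported above as carrying most of the style: out5~out11.
STYLE_BLOCKS = {f"out{i}" for i in range(5, 12)}


def describe(block: str) -> dict:
    """Return how far a block is from the mid block and whether it is style-heavy."""
    idx = BLOCKS.index(block)
    return {
        # Small distance: more overall layout. Large distance: more detail.
        "distance_from_mid": abs(idx - MID_INDEX),
        "style_heavy": block in STYLE_BLOCKS,
    }


for name in ("in11", "out0", "out5", "out11"):
    print(name, describe(name))
```

Under this reading, in11 and out0 sit right next to the mid block and mostly affect layout, while out11 is the furthest block and mostly affects fine detail.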
-
Thank you for your work. Although I don't understand the theory, I was not able to reproduce on SD15 the results you showed, even when using your images.
-
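After a few days of limited experimentation, I can do pose transfer on SD15, and style transfer too, but "putting the texture of Captain America's suit onto a sports car" is really difficult. Maybe I'm heading in the wrong direction; I'll keep trying.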
-
I started working on style and composition transfer for SD1.5 and I've pushed an experimental version that you can play with already. Please note that you may need to reload your workflow and select the weight type again.
Working with SD1.5 layers seems a bit more complicated, so things might change in the coming days, also based on your feedback.
The all-in-one Style & Composition node doesn't work for SD1.5 at the moment, but you can apply either style or composition with the Advanced node (and style with the simple ipadapter node).
Suggestions: play with the weight! Around 1.2 seems a good starting point. The text prompt is very important, more important than with SDXL. Use sensible prompting and increase the weight of individual words/tokens if necessary.
The style seems to work pretty well. In the pictures below you can see the difference between SDXL and SD1.5 (zoom in).
Style
While it's true that some layers are more specialized for style, in SD1.5 applying the reference image to only those layers doesn't seem to be enough to get a strong style transfer. After a lot of testing I ended up with the following combination:
Applying the style to all of them gives a strong style without the content bleeding through. Interestingly, while experimenting I've also been able to identify layers for faces and, apparently, age.
The final output blocks don't seem to influence the image much, but they don't bleed content either, so I'm keeping them for now.
Since I'm using such a wide spectrum of layers, applying Style+Composition together doesn't work very well (we are basically touching all layers). I will try more combinations in the coming days.
Composition / layout
The layout is a bit more complicated. Input 8 (idx 5) seems the strongest, but without a little help from input 7 it doesn't work very well. If in_7 is too strong, though, it overrides the text prompt almost completely. So I'm applying full strength to in_8 and weight*0.25 to in_7. That seems to work for now.
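The composition weighting described above can be sketched roughly as follows. This is only an illustration, not the node's actual code: the function name `composition_weights`, the `in7_scale` parameter, and the string keys `in_7` / `in_8` are hypothetical labels chosen for this example.

```python
from typing import Dict


def composition_weights(weight: float, in7_scale: float = 0.25) -> Dict[str, float]:
    """Per-layer weights for composition transfer on SD1.5 (illustrative only).

    in_8 receives the full user-selected weight; in_7 receives a reduced
    share so that it supports the layout without overriding the text prompt.
    """
    return {
        "in_7": weight * in7_scale,  # helper layer, kept weak on purpose
        "in_8": weight,              # strongest layout layer, full strength
    }


# Using the suggested starting weight of 1.2:
print(composition_weights(1.2))
```

Keeping the in_7 share as a parameter makes it easy to try other ratios if 0.25 turns out to be too strong or too weak for a given prompt.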