I just combined one of my Stable Diffusion safetensors with the video safetensors from text-to-video-ms-1.7b #46
-
Give us a link to the model?
-
Absolutely, post it on r/StableDiffusion. They love such experiments.
-
goth_girl_miniskirt.mp4
-
I did some experimenting with this and wasn't able to find anything interesting. I created 3 different merges between text2video_pytorch_model.pth and Deliberate v2. (The .pth file is in CKPT format, not Safetensors.) I used Weighted Sum, Don't Copy Config, Bake In VAE: None.

For sampling: prompt "a raccoon walking in the forest, DSLR, detailed", negative prompt "blurry, drawing", 30 steps, CFG 12.5, 30 frames, seed 1, 256 x 256, video at 5 FPS, CRF 12.

The resulting models have different checksums from each other, and are all 5.25 GB like the original model.

https://user-images.githubusercontent.com/33569918/227389067-3eafdb52-312b-4b61-ad34-4802ca2004fd.mp4

https://user-images.githubusercontent.com/33569918/227389129-6b0e7370-1ff2-457e-acb3-bfac6dd99e6a.mp4

https://user-images.githubusercontent.com/33569918/227389153-870924c5-af6a-469b-ac23-f212d1ac931a.mp4

https://user-images.githubusercontent.com/33569918/227389180-74013056-f139-40d8-9ceb-ef2971e38c46.mp4

The videos are not technically identical (they have different checksums), but they are nearly visually identical. I used Photoshop to compare the first frame of each merged model's video against the first frame of the original model's video, using the "Difference" color blending mode:
As you can see, there is no visible difference between the frames. The difference images aren't pure black; there are scattered pixels with values like "010000" instead of "000000", but the frames are basically identical.
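For anyone who wants to check this numerically rather than eyeballing it in Photoshop, here's a rough sketch of the same "Difference" comparison in Python. The filenames are placeholders for first frames extracted from the videos:

```python
# Pixel-wise absolute difference between two frames, like Photoshop's
# "Difference" blend mode. Filenames below are hypothetical placeholders.
import numpy as np
from PIL import Image

# Cast to int16 so the subtraction can't underflow uint8 values.
a = np.asarray(Image.open("frame_original.png").convert("RGB"), dtype=np.int16)
b = np.asarray(Image.open("frame_merged.png").convert("RGB"), dtype=np.int16)

diff = np.abs(a - b)
print("max per-channel difference:", diff.max())    # e.g. 1 -> the "010000"-style pixels
print("mean per-channel difference:", diff.mean())
```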
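As for what the merge is actually doing: my understanding is that Weighted Sum is roughly a per-tensor linear interpolation between the two state dicts. Here's a minimal sketch of that idea, not the merger's actual code; the paths and the weight are placeholders:

```python
# Rough sketch of a weighted-sum checkpoint merge:
# merged = alpha * A + (1 - alpha) * B for every tensor the two share.
import torch

alpha = 0.7  # hypothetical merge weight

a = torch.load("text2video_pytorch_model.pth", map_location="cpu")
b = torch.load("deliberate_v2.ckpt", map_location="cpu")

# .ckpt files often nest the weights under a "state_dict" key.
a = a.get("state_dict", a)
b = b.get("state_dict", b)

merged = {}
for key, ta in a.items():
    tb = b.get(key)
    if tb is not None and tb.shape == ta.shape:
        merged[key] = alpha * ta + (1.0 - alpha) * tb
    else:
        merged[key] = ta  # keys missing from B pass through unchanged

torch.save({"state_dict": merged}, "merged.ckpt")
```

If the two checkpoints share few matching keys (the SD UNet and the text2video UNet have different architectures), most tensors would pass through untouched, which might explain why the merged models produce nearly identical videos to the original.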
-
It didn't break it, and I'm not entirely sure what I was expecting, but the output is... good?
(I know nothing about how safetensors are made, what they consist of, or what combining them does, and I just wanted to see what would happen. I did a 70/30 mix of the txt2vid model and my... octopus?... model, and... yeah.)
Just thought I'd throw it out there. Not quite sure where else to post something like this haha
(The auto1111 extension is working well. Thanks!)