
[Question] Is the Stable Diffusion pretrained model kept frozen while running together with ProlificDreamer? #317

Closed
dedoogong opened this issue Oct 7, 2023 · 4 comments

@dedoogong

Hello!
I'm wondering how ProlificDreamer works; please give me some hints!

  1. Is the Stable Diffusion pretrained model kept frozen while running together with ProlificDreamer? If so, when does the model run during training? I tried to debug the pipeline and found it may be related to pipe (pipe = StableDiffusionPipeline.from_pretrained) or self.submodules in threestudio/models/guidance/stable_diffusion_vsd_guidance.py, but I only see the property functions below run on every iteration, and only implicitly (i.e. they run inside the diffusers library code, not explicitly in threestudio code):
    "def pipe(self):
    return self.submodules.pipe

    @Property
    def pipe_lora(self):
    return self.submodules.pipe_lora

    @Property
    def unet(self):
    return self.submodules.pipe.unet

    @Property
    def unet_lora(self):
    return self.submodules.pipe_lora.unet

    @Property
    def vae(self):
    return self.submodules.pipe.vae

    @Property
    def vae_lora(self):
    return self.submodules.pipe_lora.vae
    "

  2. Does the SD pretrained model provide a 2D image on every iteration, or just 4 images for the 4 views (front/back/side/overhead) at the first iteration?

If the SD pretrained model generates 2D images of the 4 views on every iteration, I think the NeRF can't be trained well, because the images keep changing a little due to SD's inconsistency.

  3. The default SD-based ProlificDreamer fails to generate the human I want (I want to generate an A- or T-posed human but can't). So I want to replace SD with my custom ControlNet (using OpenPose, HED, or Canny). Should I replace threestudio/models/guidance/stable_diffusion_vsd_guidance.py with a new ControlNet version of the VSD guidance, referencing threestudio/models/guidance/controlnet_guidance.py?

Thank you very very much!

@DSaurus
Collaborator

DSaurus commented Oct 15, 2023

Hi @dedoogong, here are my responses to your queries:

  1. The SD pretrained model is frozen because we set its requires_grad to False. We only train the LoRA layers in ProlificDreamer (see the sketch after this list).
  2. The images are provided by rendering from randomly sampled viewpoints at each iteration. The number of images matches the batch size.
  3. Currently, I've observed that VSD + ControlNet isn't optimal for 3D generation, particularly for human subjects. You can find more details in this pull request (Controlnet vsd #279). Feel free to check it out!
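
As a rough illustration of point 1, here is a minimal sketch of the freezing pattern with diffusers. It is not the threestudio code itself; the model id is only an example and the LoRA-injection step is left abstract:

    # Minimal sketch: freeze the pretrained SD weights, train only LoRA parameters.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

    # Freeze the pretrained SD model: set requires_grad to False so no
    # gradients ever flow into its weights.
    for module in (pipe.unet, pipe.vae, pipe.text_encoder):
        module.requires_grad_(False)
        module.eval()

    # In VSD only the LoRA parameters injected into the second ("lora") UNet
    # are optimized. How the LoRA layers are injected depends on the diffusers
    # version; once they exist, training reduces to optimizing just the
    # parameters that are still marked as trainable:
    trainable_params = [p for p in pipe.unet.parameters() if p.requires_grad]
    # (after LoRA injection this list is non-empty and would be handed to an
    # optimizer such as torch.optim.AdamW)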

@dedoogong
Author

dedoogong commented Oct 15, 2023

Oh, thanks! I didn't know there was already an implemented one~! I've also been trying to write my own ControlNet VSD version over the last week and it's almost done. I found my implementation is almost the same as the branch you mentioned ;)

Still, regarding the ControlNet version for human character creation, multiple ControlNets are a MUST in my experience.

I tried plain 2D generation with multiple ControlNets (OpenPose, Normal, Depth, Lineart) and it was definitely better than a single ControlNet; the rough setup is sketched below.
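
Roughly, that 2D multi-ControlNet setup looks like the following sketch with diffusers (the checkpoint names are the usual ControlNet v1.1 ones and the conditioning images are placeholders, so treat the exact ids and file names as assumptions):

    # Sketch of plain 2D generation with several ControlNets stacked in
    # diffusers; checkpoint names and input images are illustrative only.
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    controlnets = [
        ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_openpose"),
        ControlNetModel.from_pretrained("lllyasviel/control_v11f1p_sd15_depth"),
    ]
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnets
    )

    # One conditioning image per ControlNet, in the same order as above.
    pose_image = load_image("pose.png")    # placeholder file names
    depth_image = load_image("depth.png")

    image = pipe(
        "a person standing in a T-pose, full body",
        image=[pose_image, depth_image],
        controlnet_conditioning_scale=[1.0, 0.8],
    ).images[0]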

So, if possible, I hope @yankeesong could extend his version to multiple ControlNets too ;) If not, maybe I will try.
By the way, is there a config file for me to try your or his ControlNet-VSD guidance? There is only a 2D test version for debugging.

When I applied guidance_type: "stable-diffusion-controlnet-vsd-guidance" in the config, it shows a ControlNetVSDGuidance.__call__() missing 1 required positional argument: 'prompt_utils' error.
And I found that the system really doesn't pass the condition image. Is the branch really implemented correctly?

@yankeesong
Collaborator

Hi @dedoogong, thanks for your interest!

When I implemented and tested ControlNet-VSD, I also found that it doesn't offer an obvious improvement over ControlNet + SDS (i.e. VSD doesn't help much). So I would recommend simply using the basic ControlNet version. Adding more control images doesn't necessarily mean you need to use VSD either.

I'm sorry if there are bugs in the branch (it worked fine when I pushed it, but it may not have been thoroughly tested). Unfortunately, I've recently been busy with other work and am not able to contribute further. You might need to implement it yourself if needed. Good luck!

@dedoogong
Author

Hi @yankeesong!
Thanks for your kind reply!
Fortunately, I was able to run your code by modifying it a little bit ;)
Yes, I also saw that the ControlNet-VSD result is not that much better than SDS, even though it does seem better than the original SD-VSD.
