Not utilizing gpu during image generation #428

Open
Shobhit043 opened this issue Oct 2, 2024 · 9 comments

Comments

@Shobhit043

Hi, thank you for providing this code.
I am currently running the schnell q2 model in a Kaggle notebook, but when it starts generating the image it always shows 'Using CPU backend' and does not utilize the GPU at all. Please help.

input cli:
!/kaggle/working/stable-diffusion.cpp/build/bin/sd \
  --diffusion-model /kaggle/working/flux1-schnell-q2_k.gguf \
  --clip_l /kaggle/working/clip_l.safetensors \
  --t5xxl /kaggle/working/t5xxl_fp8_e4m3fn.safetensors \
  --vae /kaggle/working/ae.safetensors \
  -p "A male model standing confidently against a clean white background, wearing a fitted blue t-shirt and stylish black jeans. The model has a friendly smile, with short dark hair, and is posing casually with one hand in his pocket. The lighting is bright and even, highlighting the clothing details and creating a professional e-commerce look." \
  --cfg-scale 1.0 \
  --sampling-method euler \
  --rng cuda \
  --steps 2 \
  -v

verbose:
[DEBUG] stable-diffusion.cpp:180 - Using CPU backend
[INFO ] stable-diffusion.cpp:202 - loading clip_l from '/kaggle/working/clip_l.safetensors'
[INFO ] model.cpp:793 - load /kaggle/working/clip_l.safetensors using safetensors format
[DEBUG] model.cpp:861 - init from '/kaggle/working/clip_l.safetensors'
[INFO ] stable-diffusion.cpp:209 - loading t5xxl from '/kaggle/working/t5xxl_fp8_e4m3fn.safetensors'
[INFO ] model.cpp:793 - load /kaggle/working/t5xxl_fp8_e4m3fn.safetensors using safetensors format
[DEBUG] model.cpp:861 - init from '/kaggle/working/t5xxl_fp8_e4m3fn.safetensors'
[INFO ] stable-diffusion.cpp:216 - loading diffusion model from '/kaggle/working/flux1-schnell-q2_k.gguf'
[INFO ] model.cpp:790 - load /kaggle/working/flux1-schnell-q2_k.gguf using gguf format
[DEBUG] model.cpp:807 - init from '/kaggle/working/flux1-schnell-q2_k.gguf'
WARNING: Behavior may be unexpected when allocating 0 bytes for ggml_calloc!
[INFO ] stable-diffusion.cpp:223 - loading vae from '/kaggle/working/ae.safetensors'
[INFO ] model.cpp:793 - load /kaggle/working/ae.safetensors using safetensors format
[DEBUG] model.cpp:861 - init from '/kaggle/working/ae.safetensors'
[INFO ] stable-diffusion.cpp:235 - Version: Flux Schnell
[INFO ] stable-diffusion.cpp:266 - Weight type: f16
[INFO ] stable-diffusion.cpp:267 - Conditioner weight type: f16
[INFO ] stable-diffusion.cpp:268 - Diffusion model weight type: q2_K
[INFO ] stable-diffusion.cpp:269 - VAE weight type: f32
[DEBUG] stable-diffusion.cpp:271 - ggml tensor size = 400 bytes
[DEBUG] clip.hpp:171 - vocab size: 49408
[DEBUG] clip.hpp:182 - trigger word img already in vocab
[DEBUG] ggml_extend.hpp:1046 - clip params backend buffer size = 235.06 MB(RAM) (196 tensors)
[DEBUG] ggml_extend.hpp:1046 - t5 params backend buffer size = 9083.77 MB(RAM) (219 tensors)
[DEBUG] ggml_extend.hpp:1046 - flux params backend buffer size = 3732.51 MB(RAM) (776 tensors)
[DEBUG] ggml_extend.hpp:1046 - vae params backend buffer size = 94.57 MB(RAM) (138 tensors)
[DEBUG] stable-diffusion.cpp:398 - loading weights
[DEBUG] model.cpp:1530 - loading tensors from /kaggle/working/clip_l.safetensors
[DEBUG] model.cpp:1530 - loading tensors from /kaggle/working/t5xxl_fp8_e4m3fn.safetensors
[INFO ] model.cpp:1685 - unknown tensor 'text_encoders.t5xxl.encoder.embed_tokens.weight | f8_e4m3 | 2 [4096, 32128, 1, 1, 1]' in model file
[DEBUG] model.cpp:1530 - loading tensors from /kaggle/working/flux1-schnell-q2_k.gguf
[DEBUG] model.cpp:1530 - loading tensors from /kaggle/working/ae.safetensors
[INFO ] stable-diffusion.cpp:482 - total params memory size = 13145.92MB (VRAM 0.00MB, RAM 13145.92MB): clip 9318.83MB(RAM), unet 3732.51MB(RAM), vae 94.57MB(RAM), controlnet 0.00MB(VRAM), pmid 0.00MB(RAM)
[INFO ] stable-diffusion.cpp:501 - loading model from '' completed, taking 53.47s
[INFO ] stable-diffusion.cpp:518 - running in Flux FLOW mode
[DEBUG] stable-diffusion.cpp:572 - finished loaded file
[DEBUG] stable-diffusion.cpp:1378 - txt2img 512x512
[DEBUG] stable-diffusion.cpp:1127 - prompt after extract and remove lora: "a lovely cat holding a sign says 'flux.cpp'"
[INFO ] stable-diffusion.cpp:655 - Attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:1132 - apply_loras completed, taking 0.00s
[DEBUG] conditioner.hpp:1036 - parse 'a lovely cat holding a sign says 'flux.cpp'' to [['a lovely cat holding a sign says 'flux.cpp'', 1], ]
[DEBUG] clip.hpp:311 - token length: 77
[DEBUG] t5.hpp:397 - token length: 256
[DEBUG] ggml_extend.hpp:998 - t5 compute buffer size: 68.25 MB(RAM)
[DEBUG] conditioner.hpp:1155 - computing condition graph completed, taking 39122 ms
[INFO ] stable-diffusion.cpp:1256 - get_learned_condition completed, taking 39127 ms
[INFO ] stable-diffusion.cpp:1279 - sampling using Euler method
[INFO ] stable-diffusion.cpp:1283 - generating image: 1/1 - seed 42
[DEBUG] ggml_extend.hpp:998 - flux compute buffer size: 397.27 MB(RAM)
|==================================================| 2/2 - 244.56s/it
[INFO ] stable-diffusion.cpp:1315 - sampling completed, taking 490.85s
[INFO ] stable-diffusion.cpp:1323 - generating 1 latent images completed, taking 491.69s
[INFO ] stable-diffusion.cpp:1326 - decoding 1 latents
[DEBUG] ggml_extend.hpp:998 - vae compute buffer size: 1664.00 MB(RAM)
[DEBUG] stable-diffusion.cpp:987 - computing vae [mode: DECODE] graph completed, taking 49.77s
[INFO ] stable-diffusion.cpp:1336 - latent 1 decoded, taking 49.77s
[INFO ] stable-diffusion.cpp:1340 - decode_first_stage completed, taking 49.77s
[INFO ] stable-diffusion.cpp:1449 - txt2img completed in 580.60s
save result image to 'output.png'
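
A quick way to confirm which backend a run used is to search the verbose log for the two lines that matter here. The following is only a sketch: the function names `detect_backend` and `vram_mb` are made up for illustration, and it matches only strings that appear in the log above, so any log without the CPU line is reported as "unknown" rather than guessed at.

```shell
#!/usr/bin/env bash
# Report which backend an `sd -v` log shows, using only strings that
# appear in the log in this issue. Anything else is "unknown" because
# the thread contains no GPU-run log to compare against.
detect_backend() {
    local log_file="$1"
    if grep -q "Using CPU backend" "$log_file"; then
        echo "cpu"
    else
        echo "unknown"
    fi
}

# Extract the VRAM figure from the "total params memory size" line.
vram_mb() {
    sed -n 's/.*total params memory size.*(VRAM \([0-9.]*\)MB.*/\1/p' "$1"
}

# Sample: two lines copied from the log above.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
[DEBUG] stable-diffusion.cpp:180 - Using CPU backend
[INFO ] stable-diffusion.cpp:482 - total params memory size = 13145.92MB (VRAM 0.00MB, RAM 13145.92MB): clip 9318.83MB(RAM), unet 3732.51MB(RAM), vae 94.57MB(RAM), controlnet 0.00MB(VRAM), pmid 0.00MB(RAM)
EOF

echo "backend: $(detect_backend "$sample_log")"
echo "params in VRAM (MB): $(vram_mb "$sample_log")"
```

To use it on a real run, redirect the `sd ... -v` output to a file (for example with `2>&1 | tee run.log`) and pass that file's path to the functions.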

@grauho
Contributor

grauho commented Oct 2, 2024

Did you compile with the appropriate settings for your GPU?

@Shobhit043
Author

!cmake .. -DSD_CUBLAS=ON
!cmake --build . --config Release

Is this necessary to enable GPU utilization, @grauho?

@grauho
Contributor

grauho commented Oct 2, 2024

Yes, if you're trying to use it with a CUDA-enabled graphics card you do want to build it with:
cmake .. -DSD_CUBLAS=ON
cmake --build . --config Release
as well as making sure you have the rest of the CUDA toolchain set up like it says in the README.
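
Before configuring with -DSD_CUBLAS=ON, it can help to confirm the tools are actually on PATH. This is a minimal sketch, not a README procedure: the function names `check_tool` and `check_cuda_toolchain` are hypothetical, and it assumes `nvcc` (the CUDA toolkit compiler), `nvidia-smi` (the driver utility) and `cmake` are the tools worth checking.

```shell
#!/usr/bin/env bash
# Print whether a single tool is on PATH; return 1 if it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: missing"
        return 1
    fi
}

# Check every tool so all missing ones are listed, not just the first.
check_cuda_toolchain() {
    local status=0
    check_tool nvcc || status=1        # CUDA compiler from the CUDA toolkit
    check_tool nvidia-smi || status=1  # driver utility that lists visible GPUs
    check_tool cmake || status=1
    return "$status"
}

if check_cuda_toolchain; then
    echo "toolchain looks complete"
else
    echo "toolchain incomplete; a -DSD_CUBLAS=ON build is unlikely to succeed"
fi
```

If everything is reported as found, run the two cmake commands above from the build directory.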

@Shobhit043
Copy link
Author

@grauho thanks for the help. By "CUDA tool chain" do you mean the CUDA toolkit?
Since I'm using a Kaggle notebook, the CUDA toolkit is already properly installed in it.

@grauho
Contributor

grauho commented Oct 2, 2024

No problem. Yep, the CUDA toolkit. Provided it builds without error, give it a shot and see if it recognizes the GPU properly.

@VarunJoshi10

@grauho I have tried
cmake .. -DSD_CUBLAS=ON
cmake --build . --config Release
and it gives this error:
/home/wiredhikari/flux-api/stable-diffusion.cpp/model.cpp:705:0: required from here
/usr/include/c++/13/bits/stl_tree.h:2131:14: internal compiler error: Segmentation fault
 2131 |   return _Res(__j._M_node, 0);
      |   ^~~~~~~~~~~~~~~~~~~~
0x75fe5424531f ???
    ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x75fe5422a1c9 __libc_start_call_main
    ../sysdeps/nptl/libc_start_call_main.h:58
0x75fe5422a28a __libc_start_main_impl
    ../csu/libc-start.c:360
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <file:///usr/share/doc/gcc-13/README.Bugs> for instructions.
gmake[2]: *** [CMakeFiles/stable-diffusion.dir/build.make:76: CMakeFiles/stable-diffusion.dir/model.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:167: CMakeFiles/stable-diffusion.dir/all] Error 2
gmake: *** [Makefile:136: all] Error 2
The model is not able to use the GPU.
How can I fix this?

@grauho
Contributor

grauho commented Oct 4, 2024

That looks like a different issue. Does it build normally without SD_CUBLAS enabled?

@shobhit6702

@grauho I actually tried building with CUBLAS; it took a lot of time to compile and in the end it said the build failed.
I think there could be some issue with the code.
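
A bare "failed" is hard to act on, so one option is to save the build output and pull out the first compiler error line. This is a sketch only: `first_error` is a hypothetical helper, and it assumes the compiler prints lowercase "error:" as GCC does in the failure quoted earlier in this thread.

```shell
#!/usr/bin/env bash
# Print the first compiler error line from a saved build log.
# Matches lowercase "error:" as GCC prints it; the later gmake
# "Error 1" lines only report that an earlier step failed.
first_error() {
    grep -m 1 'error:' "$1"
}

# Sample: lines copied from the build failure reported in this issue.
build_log="$(mktemp)"
cat > "$build_log" <<'EOF'
/home/wiredhikari/flux-api/stable-diffusion.cpp/model.cpp:705:0: required from here
/usr/include/c++/13/bits/stl_tree.h:2131:14: internal compiler error: Segmentation fault
gmake[2]: *** [CMakeFiles/stable-diffusion.dir/build.make:76: CMakeFiles/stable-diffusion.dir/model.cpp.o] Error 1
gmake: *** [Makefile:136: all] Error 2
EOF

first_error "$build_log"
```

For a real build, the log can be captured with `cmake --build . --config Release 2>&1 | tee build.log` and then passed to `first_error build.log`.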

@VarunJoshi10

Yeah, it built perfectly without SD_CUBLAS enabled, but with it enabled it is not able to build. And I am not able to run the model on my GPU.
