
Running 'example.py' in a Docker container can cause the system to freeze or be forced to shut down #22

Open
ziyaxuanyi opened this issue Nov 13, 2024 · 4 comments

Comments

@ziyaxuanyi

I am running it in a Docker container, and nunchaku has been successfully compiled.

I replaced the schnell model in example.py with the dev model and ran it.
When the program reaches the point where it loads svdq-int4-flux.1-dev.safetensors and prints "Done",
the system freezes or is forced to shut down.

A warning about very high system memory usage appears when the freeze or forced shutdown happens.

May I ask whether the current version of the code requires extremely high memory usage, and approximately how much memory is needed?
Or is there a memory-management or memory-leak issue?
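
As a rough way to quantify the memory question above, the peak host RAM used by the Python process can be checked with the standard library alone; this is a diagnostic sketch (assuming Linux, where `ru_maxrss` is reported in kilobytes), not part of the nunchaku codebase:

```python
# Diagnostic sketch: report the peak resident memory of this process.
# Assumes Linux (as in the Docker container), where ru_maxrss is in KB.
import resource


def peak_rss_mb():
    """Return the peak resident set size of this process in MB."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


# Print before and after the pipeline load to see how much RAM
# the load itself consumed, e.g.:
#   print(f"peak RSS before load: {peak_rss_mb():.0f} MB")
#   pipeline = nunchaku_flux.from_pretrained(...)
#   print(f"peak RSS after load:  {peak_rss_mb():.0f} MB")
print(f"current peak RSS: {peak_rss_mb():.0f} MB")
```

If the second reading jumps by tens of gigabytes, that would point at the model-loading path rather than inference.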

@nitinmukesh

nitinmukesh commented Nov 13, 2024

Are you facing this issue all the time, or is it happening occasionally?

I am running on 8GB VRAM / Windows 11 and it happens randomly. It happens during loading of the model; if the model loads, inference is smooth.

@ziyaxuanyi
Author

> Are you facing this issue all the time, or is it happening occasionally?
>
> I am running on 8GB VRAM / Windows 11 and it happens randomly. It happens during loading of the model; if the model loads, inference is smooth.

Always.

@nitinmukesh

nitinmukesh commented Nov 13, 2024

Try this: after this code, add cpu_offload.

```python
pipeline = nunchaku_flux.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    qmodel_path="mit-han-lab/svdquant-models/svdq-int4-flux.1-dev.safetensors",
)

pipeline.enable_sequential_cpu_offload()
```

@ziyaxuanyi
Author

> Try this: after this code, add cpu_offload.
>
> ```python
> pipeline = nunchaku_flux.from_pretrained(
>     "black-forest-labs/FLUX.1-dev",
>     torch_dtype=torch.bfloat16,
>     qmodel_path="mit-han-lab/svdquant-models/svdq-int4-flux.1-dev.safetensors",
> )
>
> pipeline.enable_sequential_cpu_offload()
> ```

But I don't think it's a problem with the VRAM. My GPU has 24GB of VRAM, which is sufficient. It's more likely a problem on the CPU (system RAM) side.
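
If system RAM is the suspect, one thing worth ruling out is a Docker memory cap below what the model load needs. A minimal sketch for checking that from inside the container (assuming Linux; the cgroup paths differ between v1 and v2, and the helper name here is illustrative):

```python
# Diagnostic sketch: read the memory limit Docker applies to this
# container via cgroups. Tries the cgroup v2 path first, then v1.
from pathlib import Path


def container_memory_limit():
    """Return the container memory limit in bytes, or None if unlimited/unknown."""
    for p in ("/sys/fs/cgroup/memory.max",               # cgroup v2
              "/sys/fs/cgroup/memory/memory.limit_in_bytes"):  # cgroup v1
        f = Path(p)
        if f.exists():
            raw = f.read_text().strip()
            return None if raw == "max" else int(raw)
    return None


limit = container_memory_limit()
print("container memory limit:",
      "unlimited/unknown" if limit is None else f"{limit / 1e9:.1f} GB")
```

From the host, `docker stats <container-name>` gives a live view of the container's memory usage against that limit.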
