
How much GPU memory needed in general? #8

Open
sheng19331 opened this issue Nov 18, 2024 · 10 comments

@sheng19331

Great job! Thanks for sharing the tool.
Do you have recommendations for how much GPU memory is needed to run a prediction? I was trying to run the prediction in the examples with a 4090 (24 GB), but it failed with 'ran out of memory'. Is it possible to run a prediction with Boltz-1 on such a GPU?
Thanks.

@RuikangSun

I failed with an RTX 3090 in my lab, but surprisingly succeeded in CPU mode.

@hazirliver

I also failed on 24 GB (RTX 4090), but succeeded with an RTX A6000 (48 GB). The peak memory consumption for the example run ligand.fasta was approximately 33 GB.

@jwohlwend
Owner

Hi all, yes the example file is actually fairly large. I'll make a smaller one. We'll be adding an option today to lower memory consumption, with a bit of slowdown as a tradeoff. Will report back here when it's on the main branch!

@sheng19331
Author

> Hi all, yes the example file is actually fairly large. I'll make a smaller one. We'll be adding an option today to lower memory consumption, with a bit of slowdown as a tradeoff. Will report back here when it's on the main branch!

Thanks for the prompt feedback. It would be great to have an option to adjust memory consumption for GPUs with less memory!

@aggelos-michael-papadopoulos

On my RTX 3090 it takes about 11 GB and works fine.

@MKCarter

My RTX 4090 (16 GB) seems to handle this fine:

boltz predict --out_dir test/ test/ --diffusion_samples 5
Downloading data and model to /home/michael/.boltz. You may change this by setting the --cache flag.
Checking input data.
Processing input data.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 26.44it/s]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/home/michael/miniconda3/envs/boltz/lib/python3.12/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:75: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `pytorch_lightning` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default
You are using a CUDA device ('NVIDIA GeForce RTX 4090 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:27<00:00,  0.04it/s]Number of failed examples: 0
Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:27<00:00,  0.04it/s]

This gives the following usage level:

nvidia-smi
Mon Nov 18 15:40:10 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   42C    P0             33W /  150W |    1423MiB /  16376MiB |     50%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2542      G   /usr/lib/xorg/Xorg                              4MiB |
|    0   N/A  N/A     90421      C   ...el/miniconda3/envs/boltz/bin/python       1380MiB |
+-----------------------------------------------------------------------------------------+

This is only running a single protein sequence and ligand, but I imagine it would be fine with multiple chains and ligands.
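
As an aside, the Tensor Cores warning in the log above refers to PyTorch's float32 matmul precision setting. A minimal sketch of what the warning suggests, assuming it is set in a small wrapper script before the model runs (whether boltz exposes a hook for this is an assumption on my part):

# Sketch of the setting referenced by the PyTorch warning above.
# 'high' or 'medium' trades float32 matmul precision for Tensor Core speed.
# This has to run in the same process, before the model executes.
import torch

torch.set_float32_matmul_precision("high")  # or "medium" for more speed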

@jadolfbr

The example failed on a 24 GB machine for me as well: https://instances.vantage.sh/aws/ec2/g5.2xlarge

Downloading data and model to /home/jadolfbr/.boltz. You may change this by setting the --cache flag.
Checking input data.
Processing input data.
Predicting DataLoader 0:   0%|          | 0/1 [00:00<?, ?it/s]| WARNING: ran out of memory, skipping batch
Predicting DataLoader 0: 100%|██████████| 1/1 [00:02<00:00,  0.39it/s]Number of failed examples: 1
Predicting DataLoader 0: 100%|██████████| 1/1 [00:02<00:00,  0.39it/s]
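
For what it's worth, the only memory-related knob visible in this thread is --diffusion_samples (set to 5 in the successful run above). Whether lowering it actually reduces peak GPU memory is an assumption on my part, but a lighter invocation to try would be:

boltz predict --out_dir test/ test/ --diffusion_samples 1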

@moritztng

> I failed with an RTX 3090 in my lab, but surprisingly succeeded in CPU mode.

What was the input and how long did it take?

@moritztng

The weights are only 6.5 GB, but when predicting with the CPU it uses up to 30 GB of RAM. What is taking so much memory?
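
In case it helps narrow this down, here is a minimal sketch for measuring peak memory around a prediction; run_prediction is a hypothetical placeholder, not part of the boltz API:

# Minimal sketch for measuring peak memory around an inference call.
import resource
import torch

def report_peak_memory():
    # Peak resident set size of this process (ru_maxrss is in KB on Linux).
    peak_rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"Peak CPU RSS: {peak_rss_kb / 1e6:.1f} GB")
    if torch.cuda.is_available():
        # Peak CUDA memory allocated by tensors in this process.
        print(f"Peak GPU allocation: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")

# run_prediction()  # hypothetical placeholder: run the model here first
report_peak_memory()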

@RuikangSun

> > I failed with an RTX 3090 in my lab, but surprisingly succeeded in CPU mode.
>
> What was the input and how long did it take?

1 receptor and 1 ligand; it took maybe a quarter of an hour, or a few quarters?
