CPU only version of molfeat #65

kkovary · 2023-06-29T04:51:09Z

Is there an existing issue for this?

I have searched the existing issues and found nothing

Bug description

I've been trying to build applications around molfeat. Installing it on systems without GPUs, such as GitHub Actions runners, old or lightweight servers, and other testing environments, is currently really difficult because it depends on a GPU-enabled build of PyTorch.



maclandrol · 2023-06-29T05:09:01Z

Hello @kkovary, installing molfeat on a system without a GPU should work (see our CI, for example). Can you share the steps you are using?

kkovary · 2023-06-29T05:51:48Z

Hi @maclandrol thanks for getting back to me. I'm wondering if the issue is arising from my team using poetry to manage dependencies and your team using mamba.

Currently, our pyproject.toml file looks like this:

[tool.poetry]
name = "chem-transformer"
version = "0.1.0"
description = ""
authors = ["Your Name <[email protected]>"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.10"
pydantic = "^1.10.8"
rdkit = "^2023.3.1"
molfeat = "^0.8.9"
datamol = "^0.10.3"
torch = [
     {version = "^1.13.0", markers = "sys_platform == 'macos'", optional = true},
     {version = "^1.13.0", markers = "sys_platform == 'linux'", optional = true},
 ]

[tool.poetry.group.ci.dependencies]
torch = "^1.11.0+cpu"
pytest = "^7.3.1"

[tool.poetry.dependencies.pytest]
version = "^7.3.1"
optional = true

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

This works fine for test environments, but building a Docker image that will run on a CPU-only system fails due to missing CUDA libraries.

maclandrol · 2023-06-29T06:24:13Z

Thanks @kkovary, I will investigate it in the morning.

If you can share the exact error so I can try to reproduce it (or find the dependency behind the issue), that would be nice. The only candidate I can see is torch, since you are not installing the extra dependencies, but I will run some tests.

Can you confirm that removing molfeat from the poetry file above works?

kkovary · 2023-06-29T06:57:46Z

Hi @maclandrol, I stripped down the pyproject.toml file to remove the torch workarounds that were included above:

[tool.poetry]
name = "chem-transformer"
version = "0.1.0"
description = ""
authors = ["Your Name <[email protected]>"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.10"
pydantic = "^1.10.8"
rdkit = "^2023.3.1"
datamol = "^0.10.3"
molfeat = "^0.8.9"

[tool.poetry.group.ci.dependencies]
pytest = "^7.3.1"

[tool.poetry.dependencies.pytest]
version = "^7.3.1"
optional = true

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

These are the errors that I'm seeing in the GitHub runner that we're using (ubuntu-latest):

 from chem_transformer.datamol_feats import Molecule
/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages/chem_transformer/datamol_feats.py:6: in <module>
    from molfeat.calc import _CALCULATORS, FP_FUNCS, get_calculator
/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages/molfeat/calc/__init__.py:3: in <module>
    from .cats import CATS
/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages/molfeat/calc/cats.py:21: in <module>
    from molfeat.utils.datatype import to_numpy
/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages/molfeat/utils/datatype.py:6: in <module>
    import torch
/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages/torch/__init__.py:228: in <module>
    _load_global_deps()
/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages/torch/__init__.py:189: in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages/torch/__init__.py:154: in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
E   ValueError: libcublas.so.*[0-9] not found in the system path ['/home/runner/work/chem-transformer/chem-transformer/apps', '/home/runner/work/chem-transformer/chem-transformer/apps/library_enumerator', '/opt/hostedtoolcache/Python/3.10.12/x64/lib/python310.zip', '/opt/hostedtoolcache/Python/3.10.12/x64/lib/python3.10', '/opt/hostedtoolcache/Python/3.10.12/x64/lib/python3.10/lib-dynload', '/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages']
=========================== short test summary info ============================
ERROR tests/test_enumerator.py - ValueError: libcublas.so.*[0-9] not found in the system path ['/home/runner/work/chem-transformer/chem-transformer/apps', '/home/runner/work/chem-transformer/chem-transformer/apps/library_enumerator', '/opt/hostedtoolcache/Python/3.10.12/x64/lib/python310.zip', '/opt/hostedtoolcache/Python/3.10.12/x64/lib/python3.10', '/opt/hostedtoolcache/Python/3.10.12/x64/lib/python3.10/lib-dynload', '/home/runner/.cache/pypoetry/virtualenvs/library-enumerator-qd3v0RhZ-py3.10/lib/python3.10/site-packages']

From what I can tell, the error is raised when the Python interpreter is unable to find the libcublas.so library in the system path. This library is a part of the CUDA toolkit and is required by PyTorch for GPU-accelerated operations.

The error occurs when PyTorch is being imported in the molfeat package, which is a dependency of the chem_transformer package we're developing. PyTorch tries to preload CUDA dependencies, including libcublas.so, but fails to find it in the system path.
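
A minimal sketch of the kind of lookup the traceback points at, useful for checking an environment without importing torch. The helper name `find_nvidia_lib` and the `nvidia/<lib_folder>/lib/` directory layout are assumptions made for illustration; only the `libcublas.so.*[0-9]` pattern and the fact that `sys.path` is searched come from the error above.

```python
import glob
import os
import sys
from typing import Iterable, Optional


def find_nvidia_lib(
    lib_folder: str,
    lib_name: str,
    search_paths: Optional[Iterable[str]] = None,
) -> Optional[str]:
    """Return the first match for nvidia/<lib_folder>/lib/<lib_name>, or None.

    The directory layout is an assumption about how the nvidia-* pip
    wheels unpack into site-packages; it is not taken from the thread.
    """
    paths = sys.path if search_paths is None else search_paths
    for path in paths:
        # Escape the directory so only lib_name is treated as a glob pattern.
        pattern = os.path.join(glob.escape(path), "nvidia", lib_folder, "lib", lib_name)
        matches = sorted(glob.glob(pattern))
        if matches:
            return matches[0]
    return None


if __name__ == "__main__":
    # Same glob pattern that appears in the ValueError above.
    found = find_nvidia_lib("cublas", "libcublas.so.*[0-9]")
    print(found or "libcublas not found under any sys.path entry")
```

If this prints a "not found" message inside the Poetry virtualenv, that would be consistent with a CUDA-enabled torch wheel having been installed without its CUDA libraries.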

@rhjohnstone made a similar request on Slack here.

maclandrol · 2023-06-29T13:27:33Z

Since around torch 1.13, some CUDA dependencies are downloaded along with the pip version of PyTorch. Downgrading to an older version of torch could work, but that doesn't seem like a long-term solution. Using conda/mamba would likely fix the issue too.
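
One way to check what actually got installed, sketched with the standard library only so that torch itself is never imported. The function names are hypothetical. It assumes that CPU wheels from the PyTorch index carry a `+cpu` local version tag (as in `torch==2.0.1+cpu`, mentioned later in this thread) and that the CUDA libraries ship as distributions whose names start with `nvidia-`.

```python
from importlib import metadata
from typing import List, Optional


def local_version_tag(version: str) -> Optional[str]:
    """Return the part after '+' in a version string, e.g. 'cpu', else None."""
    _, sep, local = version.partition("+")
    return local if sep else None


def installed_nvidia_wheels() -> List[str]:
    """Sorted names of installed distributions that start with 'nvidia-'."""
    names = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        raw = (meta.get("Name") if meta is not None else None) or ""
        name = raw.lower().replace("_", "-")
        if name.startswith("nvidia-"):
            names.add(name)
    return sorted(names)


if __name__ == "__main__":
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None
    print("torch version:", torch_version)
    if torch_version is not None:
        print("local tag:", local_version_tag(torch_version))
    print("nvidia wheels:", installed_nvidia_wheels())
```

A torch version without a `+cpu` tag together with an empty list of nvidia wheels would match the situation in the traceback, although the tag convention may differ by platform.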

I looked around a bit, and there seems to be a history of issues between torch and Poetry:

I am not a Poetry user, but can you check whether these issues are relevant?

I will try to make torch optional, or add optional dependencies such as [cpu-only] and [gpu] to the pyproject file. Let me know if that would give you enough flexibility.
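
For illustration only, making a dependency optional on the Python side usually means guarding the import. The sketch below is not molfeat's code; `optional_import` and `is_torch_tensor` are hypothetical helpers showing the general pattern.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module by name, returning None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def is_torch_tensor(obj: object) -> bool:
    """True only when torch is importable and obj is a torch.Tensor."""
    torch = optional_import("torch")
    return torch is not None and isinstance(obj, torch.Tensor)
```

Note that this only covers the case where torch is absent. The failure reported above is a ValueError raised while torch is being imported, which this guard deliberately does not swallow.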

maclandrol · 2023-07-20T21:31:09Z

@jstlaurent any input here?

jstlaurent · 2023-08-11T19:16:20Z

@maclandrol and @kkovary: My apologies, it's taken me a while to get around to looking at this issue.

I'm not a Poetry expert by a long shot, so unfortunately I wasn't able to find a good solution to your issue, @kkovary.

You can override the torch variant to select the CPU version in your pyproject.toml file, like so:

[tool.poetry.dependencies]
python = "^3.10"
pydantic = "^1.10.8"
rdkit = "^2023.3.1"
datamol = "^0.10.3"
molfeat = "^0.9.2"
torch = { version = "^2.0.0", source="torch-cpu"}

[[tool.poetry.source]]
name = "PyPI"
priority = "primary"

[[tool.poetry.source]]
name = "torch-cpu"
url = "https://download.pytorch.org/whl/cpu"
priority = "supplemental"

But this will always select the CPU variant, and never the GPU enabled one.

Unfortunately, even Poetry's dependency groups can't help us here, because Poetry takes into account all dependencies, including optional ones, when resolving the package to install. So this will also always select the CPU variant:

[tool.poetry.dependencies]
python = "^3.10"
pydantic = "^1.10.8"
rdkit = "^2023.3.1"
datamol = "^0.10.3"
molfeat = "^0.9.2"

[tool.poetry.group.cpu]
optional = true

[tool.poetry.group.cpu.dependencies]
torch = { version = "^2.0.0", source="torch-cpu"}

[[tool.poetry.source]]
name = "PyPI"
priority = "primary"

[[tool.poetry.source]]
name = "torch-cpu"
url = "https://download.pytorch.org/whl/cpu"
priority = "supplemental"

Even if you run poetry install --without cpu, Poetry will conclude that torch==2.0.1+cpu is the package that satisfies every requirement.

My suggestion to you, as unpleasant as it might be, is to maintain two pyproject.toml files: a default one for the GPU build, and another with the explicit CPU-only version for your CPU build.
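
To keep the two files from drifting apart, the CPU variant could be generated from the default one. This is a rough sketch, not a tested workflow: `make_cpu_variant` is a hypothetical helper, the torch pin and source entries are copied from the example above, and it assumes the base file declares neither torch nor any [[tool.poetry.source]] entries.

```python
TORCH_CPU_DEP = 'torch = { version = "^2.0.0", source = "torch-cpu" }'

TORCH_CPU_SOURCES = '''
[[tool.poetry.source]]
name = "PyPI"
priority = "primary"

[[tool.poetry.source]]
name = "torch-cpu"
url = "https://download.pytorch.org/whl/cpu"
priority = "supplemental"
'''

DEPS_HEADER = "[tool.poetry.dependencies]"


def make_cpu_variant(base_toml: str) -> str:
    """Return base_toml with the torch CPU pin and sources added.

    Assumes the base file has a [tool.poetry.dependencies] table and
    declares neither torch nor any [[tool.poetry.source]] entries.
    """
    lines = base_toml.splitlines()
    stripped = [line.strip() for line in lines]
    if DEPS_HEADER not in stripped:
        raise ValueError(f"{DEPS_HEADER} table not found")
    index = stripped.index(DEPS_HEADER)
    # Key order inside a TOML table is irrelevant, so insert right after the header.
    lines.insert(index + 1, TORCH_CPU_DEP)
    return "\n".join(lines).rstrip("\n") + "\n" + TORCH_CPU_SOURCES
```

The CPU build would then write the returned text over its copy of pyproject.toml before locking and installing, while the GPU build keeps using the unmodified file.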

@maclandrol: To solve this upstream of Poetry users, we would probably have to manage which version of torch gets pulled in molfeat proper, using extras to have users explicitly pick CPU vs GPU options.

maclandrol · 2023-08-11T20:40:58Z

@jstlaurent I attempted a pytorch-free version, but it was a bad idea overall, so the cpu-only extra might indeed be the solution.

kkovary added the bug label (Something isn't working) on Jun 29, 2023
