Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packaging of Graphium 3.0 #531

Open
wants to merge 54 commits into
base: graphium_3.0
Choose a base branch
from
Open

Packaging of Graphium 3.0 #531

wants to merge 54 commits into from

Conversation

Andrewq11
Copy link
Collaborator

@Andrewq11 Andrewq11 commented Nov 6, 2024

Changelogs

  • Updated setup.py to build graphium_cpp as part of the graphium build process.
  • Small changes to setup.py to allow proper graphium_cpp compilation.
  • Addition of doxygen for graphium_cpp doc building. Configured to integrate into existing docs deployment.(@DomInvivo I would pull this branch and run mkdocs serve to ensure it all looks good to you. There is a nav menu button named "Graphium C++").
  • env.yml file updated with dependencies/versions that are compatible with graphium and graphium_cpp.
  • Updated README.md to reflect how to install 3.0.0. A note was added about PyPi version limitation.

Checklist:

  • Was this PR discussed in an issue? It is recommended to first discuss a new feature into a GitHub issue before opening a PR. Graphium 3.0 #519
  • Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
  • Update the API documentation is a new function is added, or an existing one is deleted.
  • Write concise and explanatory changelogs above.
  • If possible, assign one of the following labels to the PR: feature, fix or test (or ask a maintainer to do it for you).

discussion related to that PR

Important Notes

  • The conda-forge recipe for Graphium 3.0 is essentially ready-to-go. I just need to open a PR into the main feedstock when we are getting closer to deployment and test cross-platform builds. Branch is here - https://github.com/Andrewq11/graphium-feedstock/tree/release/graphium-3.0
  • @WenkelF is going to update the graphium documentation given that he has the most context about what features are available and not in the package. We looked today and some work was needed.
  • @WenkelF I noticed there were still some references to IPUs in graphium/trainer/metrics.py. Is this correct?
  • @DomInvivo we can only support Python versions 3.9, 3.10 and 3.11 right now. Because pyg is pinning us to numpy version 1.25.0, we cannot use 3.12 (not compatible with that version of numpy). Once pyg makes a new release, I will update our conda-forge recipe to build for 3.12.
  • Please note the changes in tests/test_datamodule.py. I had to update two assert statements to get passing tests. Was this update correct or did I just make a failing test pass?
  • This PR looks much bigger than it is. >90% of the files changed are auto-generated by doxygen.

@Andrewq11 Andrewq11 added the ASAP priority Major thing to do or fix label Nov 6, 2024
@Andrewq11 Andrewq11 requested a review from WenkelF November 6, 2024 22:36
@Andrewq11 Andrewq11 self-assigned this Nov 6, 2024
@Andrewq11 Andrewq11 requested a review from DomInvivo as a code owner November 6, 2024 22:36
@Andrewq11 Andrewq11 mentioned this pull request Nov 6, 2024
28 tasks
Copy link
Collaborator

@DomInvivo DomInvivo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made a few minor requests.
Also, I feel that the Graphium-cpp main page should not be empty. Perhaps a text stating:

Welcome to the documentation for the C++ API of the graphium library. For the python API, go to this linkhttps://graphium-docs.datamol.io/stable/). The C++ API is used to significantly accelerate the computation of positional and structural encodings, but also the caching of the supervised labels into very light and optimized files. This significantly accelerates the dataloading and removes the need for long pre-processing of large datasets.

image

README.md Outdated
### conda-forge
conda-forge is the recommended method for installing Graphium. To install Graphium via conda-forge, run the following command:
```
conda install graphium
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
conda install graphium
mamba install graphium -c conda-forge

env.yml Outdated
- tqdm
- platformdirs

# scientific
- numpy == 1.26.4
- numpy=1.25.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- numpy=1.25.0
- numpy <= 1.25.0

Too strict

env.yml Outdated
- mup
- pytorch_sparse >=0.6
- pytorch_cluster >=1.5
- pytorch_scatter >=2.0

# chemistry
- rdkit == 2024.03.4
- rdkit=2024.03.4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too strict. Can you do <= or >=?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll run a test build with <= and report back.

@Andrewq11 Andrewq11 requested a review from DomInvivo November 13, 2024 13:51
@Andrewq11
Copy link
Collaborator Author

@DomInvivo applied your suggestions and got the tests running for all operating systems we'll support. It's ready for another quick round of review. Please remember to build the docs and checkout the changes I made to the C++ API docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ASAP priority Major thing to do or fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants