-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add version constraints, poetry, and poetry.lock
file, and use nox-poetry
and py3.12 in CI
#202
Conversation
move pytest-cov command line args to noxfile so they don't always run
poetry.lock
file, and use nox-poetry
in CIpoetry.lock
file, and use nox-poetry
and py3.12 in CI
fix: add version constraints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the changes! I appreciate that we won't unknowingly break support for lower package versions. I don't mind bumping our dependency requirements up regularly if we want their newer features, but it should be done in a documented way, and not produce bugs post-install.
I also appreciate the code coverage test being split into a separate test, that's cleaner.
Despite your good descriptions, I don't actually understand much about what poetry is doing. I think that's the right effort/outcome tradeoff, so I'll do this review without that knowledge. I've also never understood much of the CI configuration here, so cannot review that critically.
In general, I lean towards it being more important that trtools is easy to write tests for, and that it is testable when it is installed, than it is to reduce file sizes. Happy to discuss, perhaps that's not the correct heuristic. But my comments reflect that.
The CI check that all new files are smaller than 0.5 MB should be a warning but able to be overridden. Sometimes the cleanest thing to do for writing a test is to upload a 10MB VCF and write a test against it. That may not be best practice, but it is something I want to support.
Excluding trtools/testsupport/* includes excluding trtools/testsupport/test_trtools.sh, which means installs of trtools will not come with working tests. I don't particularly like that solution.
This PR adds version constraints and a lock file to our repository. It also adds poetry (a dependency management tool) to our development environment, so that we can use it to maintain our lock file. Finally, it reconfigures our CI to use nox-poetry, allowing us to run our CI checks with the locked versions of our dependencies.
The code in this PR was adapted from the haptools repository which was, in turn, inspired by the Hypermodern Python series.
a new development setup
This PR makes some changes to our development workflow. First, it changes the dev installation instructions on the README. Rather than installing the project using pip, it now says to use poetry. It also provides a conda environment (in the
dev-env.yml
file) to ensure that the right poetry version is installed. Thedev-env.yml
file also includes nox and some other packages that we require for testing and development.Within the conda environment, poetry creates its own separate virtual environment with TRTools and all of its dependencies, using the versions in our lock file. To enter it, you can run
poetry shell
. And then you can continue coding as if you had just done an editable install (withpip install -e
)It's important that this child environment remain separate from its parent conda environment to reduce the chance of version conflicts between the environments. Some packages installed in the conda environment, like bcftools and ART, will be available in the child environment, but the same won't be true for the python packages like poetry or nox.
To add or modify a dependency of TRTools, we can use the
poetry add
command, documented in the Contributing section of our README. This will install the new dependency in the child environment and update the pyproject.toml and poetry.lock files accordingly.why should we use poetry?
To see how poetry might prevent us from breaking support for older python versions, let's consider an example. Let's say we have pandas >= 2.0.3 in our pyproject.toml and pandas is locked to 2.0.3 in our lock file. We want to develop some code that uses a new function/feature of pandas. But unbeknownst to us, the feature is only available in pandas 2.1.0+, which is also the first version of pandas to drop support for python 3.8. (In fact, a situation like this has already happened for python 3.6 in #203 !)
If we weren't using poetry, we would probably use the new function in our code and unknowingly break support for python 3.8. So how does poetry help in this situation? In two ways.
Our development environment is configured to use only the locked versions of our dependencies. This ensures that all developers will use the minimum supported versions of each of our dependencies. So when we try to run our code that uses the new pandas feature, it will fail, since we will have pandas 2.0.3 installed locally instead of 2.1.0
Once we figure out that we need a newer version of pandas, we might try to install pandas 2.1.0 via
poetry add 'pandas>=2.1.0'
. At that moment, poetry should complain and warn us that pandas 1.2.0 has dropped support for python 3.8. This forces us to either purposefully drop support for python 3.8 or come up with a workaroundTo summarize, we should use poetry because 1) it helps us maintain a lock file and 2) it will complain about any incompatibilities between the versions of our dependencies in our pyproject.toml and help us choose the proper versions.
why should we use nox-poetry?
A developer could accidentally subvert poetry's checks and forget to update the lock file by installing pandas 2.1.0 with pip instead of poetry. Our CI should catch that. It uses nox-poetry to ensure that our tests are executed with the locked versions of our dependencies. nox also allows us to locally run our tests against each supported version of python.
Using the locked versions of our dependencies in the CI also protects our CI from any potential regressions that may be introduced in our dependencies. (For examples of this happening, see #206 and #208.) Unfortunately, I wasn't able to lock our dependencies for python 3.9+ without dropping support for python 3.7 because of cjolowicz/nox-poetry#1116 . So our CI tests for py3.9+ are still prone to breakage whenever a regression is introduced. In the future, we'll able to resolve this by using multiple constraint dependencies to maintain different versions of the same package in our lock file. That approach won't currently work because of python-poetry/poetry-plugin-export#183, but once that gets resolved, I will open a new PR to lock our dependencies for python 3.9+, as well. In the meantime, successful checks for python 3.9+ won't be required for PRs.
There are other benefits to using nox and nox-poetry, but it might be best if you just refer to their docs for the rest.
additional CI checks
This PR also adds two new CI checks:
A test for python 3.12
Confirmation that all new files are smaller than 0.5 MB
Up till now, all of the CI checks would fail if the PR's code coverage was insufficient. To fix this, I've separated the coverage test into its own CI check so that the other tests will still pass. This should hopefully make our CI checks easier to interpret. Also, just a (somewhat unrelated) heads-up that I moved the contents of
.coveragerc
into our pyproject.toml file.By default, poetry will include all files in the trtools/ folder within the distribution published to PyPI. This bloats the sdist similarly to fix: bloating of sdists by
setuptools-scm
#195. Poetry's recommended solution is to exclude directories from the sdist by listing them in the[tool.poetry.exclude]
section of our pyproject.toml file. But this means that the sdist might get bloated again if a developer adds a new directory (with files that should be excluded) but forgets to list the directory in the pyproject.toml file. To prevent this, I've added a test to check that the distribution folder never exceeds 0.3 MB.Note: I've excluded the entire testsupport directory from the source distribution, but I'm not sure that's ok? The python files in that directory weren't excluded until now, but I don't think they're needed by anything, since the test_trtools.sh script seems to download its own version of the code.