Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CI tools and dependencies #24

Merged
merged 32 commits into from
Oct 19, 2023
Merged

Update CI tools and dependencies #24

merged 32 commits into from
Oct 19, 2023

Conversation

glatterf42
Copy link
Member

Since the MESSAGE stack is moving from flake8 and isort to ruff, we will want to do the same here for consistency. Ruff can even be run via a pre-commit hook, allowing less maintenance on GHA workflow files.
While preparing this transition, I found that some of our dependencies were several (major) versions behind their current versions, most notably fastapi and pydantic. The transition is still in progress, pytest is currently still failing (but at least running again).

@glatterf42 glatterf42 added the enhancement New feature or request label Oct 3, 2023
@glatterf42 glatterf42 self-assigned this Oct 3, 2023
@glatterf42 glatterf42 marked this pull request as ready for review October 3, 2023 10:45
@glatterf42 glatterf42 marked this pull request as draft October 3, 2023 10:47
@glatterf42

This comment was marked as outdated.

@glatterf42
Copy link
Member Author

Error handling in pandera seems to have changed from version 0.13.4 to 0.17.2. Unfortunately, I couldn't find any concise changelog on things that might affect us. Safetycli has something that most resembles it and I was able to infer this as the complete 'changelog', though going through a list of all commits doesn't seem very practical either.
Reading in the docs, I stumbled upon the 'important note' that pa.SchemaModel will be deprecated soon (version 0.20) and replaced our three mentions of it with the more recent pa.DataFrameModel.

My main issue is the error handling, however. Even after several hours of searching for it, there does not seem to be a way to raise a custom exception inside a dataframe-wide validation function. Instead, pandera seems to always raise a SchemaError, at best customizable with a custom message, though I even struggled to implement this. So while I usually try to keep the changes in a way that all tests will still pass (so hopefully we won't have user-facing changes by updating our dependencies), in this case I was only able to adapt the with pytest.raises() marks for the tests in tests/core/test_iamc.py and tests/core/test_run.py that expect an InconsistentIamcType to instead expect an ixmp4.core.exceptions.SchemaError, which is still one of our custom errors, but different.

* Make creation information optional in iamc/variable
* Make id and name required for unit
* Revert bulk_* changes to use json_encoder of api.DataFrame again
* NOTE: this option will be deprecated in the future,
* pydantic serialization decorators should be used instead
* Revert use of Annotated in iamc/datapoint/filter
* Make sqla_model a proper ClassVar (rather than private attr)
* Runs mypy, black, ruff
* NOTE: needs to be run in the repo s.t. poetry detects the correct venv
* Remove separate lint.yaml action
@glatterf42
Copy link
Member Author

For the pre-commit hook, I have now opted for option 1 in this list of ways to run mypy on pre-commit. I would have liked to go with option 2, but couldn't figure out a way to run poetry's mypy from within the venv that pre-commit creates when using the mirrors-mypy hook without language:system. In particular, poetry run which mypy would point to the mypy outside the poetry venv I wanted to use. There might be a way to figure this out using a small shell script, but that would be additional maintenance effort once more, which I want to avoid.
So for now, language:system tells pre-commit to not install a venv for the mirrors-mypy hook. This means mypy and all other dependencies necessary to run it need to be present somewhere. By overwriting the execution of mypy via extra: bash -c "poetry run mypy .", poetry is now resolving mypy and corresponding packages correctly and even seems to use the existing pyproject.toml for configuration.
Through the changes in .github/workflows/pytest.yaml, poetry will be installed for this workflow job as well (which could maybe be grouped together with the installation for the other job, but we only need one python version here). So this workaround should be fine for us developers using the command line; if we install pre-commit only in the venv belonging to ixmp4, we will only ever run it inside the poetry venv where mypy also exists. If we want people to use their GUI git client, this might be a problem -- at least the article linked above suggests that pre-commit might then run without the venv being enabled. As long as it's run within the ixmp4 directory, poetry should still register everything correctly, but I have too little experience to say for sure. So this is mainly a heads-up.

In addition, this setup could allow us to employ https://github.com/floatingpurr/sync_with_poetry. This tool (after configuring its db.py file for our purposes) would automatically keep the rev numbers of black, ruff, and mypy up to date. This would leave only its own rev number for manual updates, though I don't know how big the difference is between updating fours numbers every few month compared to one number.

@glatterf42
Copy link
Member Author

As for the docs failing to build, this seems to be a long-running issue on sphinxcontrib-openapi: I'll see if I can't help them fix it.

@glatterf42
Copy link
Member Author

Using their suggested fix leads to successful builds, but the process emits a WARNING: unknown directive or role name: openapi:httpdomain and the docs look differently (specifically, this page). Thus, I wouldn't call this problem resolved.

To summarize, sphinxcontrib-openapi cannot deal with type hints of Unions of types. These get translated to the type of a schema being "anyOf"', rather than a single type`:

{"name": "table", "in": "query", "required": false, "schema": {"anyOf": [{"type": "boolean"}, {"type": "null"}], "default": false, "title": "Table"}}

In fact, the schema dict is loosing the keyword type, which sphinxcontrib-openapi required to read in the type.
This issue is well-known, there are a number of open issues and PRs in their repo mentioning it, some as much as four years old. I can try adapting these existing changes to a new PR, but I don't know if we should hold our breath for its merger. Since sphinxcontrib-openapi only seems to offer an option to include entire endpoints, which would include the majority of our endpoints by now, I am not sure what else we can do. The <type> | None type hints have become necessary due to pydantic's v2 changes away from implicit default values, at least as far as I can see.
For a field to be optional, a default value of the corresponding type or None has to be provided now, even if the type of the field is Optional[<type>]. This means a lot of default None values have been set by adapting to pydantic v2, which need to be accommodated by our endpoints' expected input and response models.
Maybe there is an entirely different way of documenting our endpoints with sphinx?

@glatterf42
Copy link
Member Author

Slight misunderstanding before, this issue persists in the openapi31 renderer. I thus created sphinx-contrib/openapi#143 to try and remedy this.

@glatterf42
Copy link
Member Author

Overview of updated dependencies

[tool.poetry.dependencies]
SQLAlchemy : { extras = ["mypy"], version = "^2.0.7" } -> {extras = ["mypy"], version = "^2.0.22"}
SQLAlchemy-Utils : "^0.40.0" -> "^0.41.1"
dask : "^2023.4.0" -> "^2023.10.0" 
fastapi : "^0.94.0" -> "^0.103.1"
httpx : { extras = ["http2"], version = "^0.23.3" } -> {extras = ["http2"], version = "^0.25.0"}
pandera : "^0.13.4" -> "^0.17.0"
pydantic : "^1.10.5" -> "^2.4.0"
python-dotenv : "^0.19.0" -> "^1.0.0"
typer : "^0.4.0" -> "^0.9.0" 

pydantic-settings = "^2.0.3"

[tool.poetry.group.docs.dependencies]
sphinx : "5.3" -> "^7.2.6" 
sphinxcontrib-openapi : "^0.8.1" -> {git = "https://github.com/glatterf42/openapi.git", branch = "enh/openapi31-anyOf"}

[tool.poetry.group.server.dependencies]
gunicorn : "^20.1.0" -> "^21.2.0"
uvicorn : { version = "^0.17.0", extras = ["standard"]} -> {extras = ["standard"], version = "^0.23.2"}

[tool.poetry.group.dev.dependencies]
black : "^23.1.0" -> "^23.9.1"
build : "^0.10.0" -> "^1.0.3" 
mypy : "~1.0.1" -> "^1.6.0"
pytest : "^6.2.5" -> "^7.4.2"
pytest-benchmark : "^3.4.1" -> "^4.0.0"
pytest-cov : "^2.12.1" -> "^4.1.0"

pandas-stubs = "^2.0.3.230814"
pre-commit = "^3.5.0"
ruff = "^0.0.292"

@glatterf42
Copy link
Member Author

Notes about enabling the docs to build

Please note that if you update your venv to the package versions in pyproject.toml and try to e.g. ixmp4 server start with the main branch checked out, you will receive errors due to the pydantic version update (and possibly others). So to compare the old style of the docs, you might want to start and keep hosting them somewhere before updating your venv or to create an entirely separate venv.

Please note that the current 'fix' for building the docs uses my fork of sphinxcontrib-openapi, so this is not approved in any way. Please check that the docs render in an acceptable form, especially lists like GET /iamc/datapoints on local server or GET /iamc/datapoints on RTD which have seen parameters change from single types to unions of types for type hints. In my browser, this change looks similar to this:

table (boolean) – -> table ({'boolean', 'null'}) –

If this style is not acceptable, we can further adapt the fork/PR to support our desired style.

Please also check the same endpoint in the Swagger UI that fastapi creates automatically: http://127.0.0.1:9000/v1/sqlite/docs#/iamc/enumerate_iamc_datapoints__get is the link for my locally hosted version (our docs use port 8000 as a default instead). FastAPI upgraded to support Openapi 3.1.0 and Swagger UI 5.x.x with version 0.99, but on my local version, type hints are only shown for parameters that have a single type. So parameters with types in anyOf do not disply information, compare e.g.

platform *
string
(path)

with

table
(query)

@glatterf42 glatterf42 requested a review from meksor October 16, 2023 13:10
meksor and others added 6 commits October 18, 2023 14:13
@meksor
Copy link
Contributor

meksor commented Oct 18, 2023

LGTM, thanks! that was a lot of work ^^'

@glatterf42 glatterf42 merged commit 68a8788 into main Oct 19, 2023
4 checks passed
@glatterf42 glatterf42 deleted the ci/update-tools branch October 19, 2023 08:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants