DM-45281: Add Pydantic types for Postgres and Redis DSNs #270

Merged: 1 commit, Jul 18, 2024
3 changes: 3 additions & 0 deletions changelog.d/20240717_101902_rra_DM_45281_queue.md
@@ -0,0 +1,3 @@
### New features

- Add new types `safir.pydantic.EnvAsyncPostgresDsn` and `safir.pydantic.EnvRedisDsn`, which validate PostgreSQL and Redis DSNs but rewrite them based on the environment variables set by tox-docker. Programs using these types for their configuration will therefore automatically honor tox-docker environment variables when running the test suite. `EnvAsyncPostgresDsn` also enforces that the scheme of the DSN is compatible with asyncpg and the Safir database support.
1 change: 1 addition & 0 deletions docs/_rst_epilog.rst
@@ -23,4 +23,5 @@
.. _structlog: https://www.structlog.org/en/stable/
.. _templatekit: https://templatekit.lsst.io
.. _tox: https://tox.wiki/en/latest/
.. _tox-docker: https://tox-docker.readthedocs.io/en/latest/
.. _Uvicorn: https://www.uvicorn.org/
2 changes: 2 additions & 0 deletions docs/documenteer.toml
@@ -25,6 +25,8 @@ nitpick_ignore = [
["py:class", "BaseModel"],
# sphinx-automodapi apparently doesn't recognize TypeAlias as an object
# that should have generated documentation, even with include-all-objects.
["py:obj", "safir.pydantic.EnvAsyncPostgresDsn"],
["py:obj", "safir.pydantic.EnvRedisDsn"],
["py:obj", "safir.pydantic.HumanTimedelta"],
["py:obj", "safir.pydantic.SecondsTimedelta"],
]
8 changes: 6 additions & 2 deletions docs/user-guide/arq.rst
@@ -50,13 +50,14 @@ If your app uses a configuration system like ``pydantic.BaseSettings``, this exa
    from urllib.parse import urlparse

    from arq.connections import RedisSettings
-   from pydantic import Field, RedisDsn
+   from pydantic import Field
    from pydantic_settings import BaseSettings
    from safir.arq import ArqMode
+   from safir.pydantic import EnvRedisDsn


    class Config(BaseSettings):
-       arq_queue_url: RedisDsn = Field(
+       arq_queue_url: EnvRedisDsn = Field(
            "redis://localhost:6379/1", validation_alias="APP_ARQ_QUEUE_URL"
        )

@@ -77,6 +78,9 @@ If your app uses a configuration system like ``pydantic.BaseSettings``, this exa
)
return redis_settings

The `safir.pydantic.EnvRedisDsn` type will automatically incorporate Redis location information from tox-docker.
See :ref:`pydantic-dsns` for more details.
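
As a rough illustration of how a Redis DSN such as the default above breaks down into connection parameters, here is a standard-library sketch.
The ``RedisLocation`` dataclass and ``parse_redis_dsn`` helper are hypothetical names used only for this example; they are not part of Safir or arq.
The fallback values (``localhost``, port 6379, database 0) are assumptions chosen to match the defaults declared for ``EnvRedisDsn`` in this change.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class RedisLocation:
    """Hypothetical container for the pieces of a Redis DSN."""

    host: str
    port: int
    database: int
    password: Optional[str]


def parse_redis_dsn(dsn: str) -> RedisLocation:
    """Split a ``redis://`` DSN into connection parameters (sketch only)."""
    parts = urlparse(dsn)
    if parts.scheme != "redis":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # The path holds the database number, e.g. "/1"; fall back to database 0.
    database = int(parts.path.lstrip("/") or 0)
    return RedisLocation(
        host=parts.hostname or "localhost",
        port=parts.port or 6379,
        database=database,
        password=parts.password,
    )
```

With this sketch, ``parse_redis_dsn("redis://localhost:6379/1")`` yields host ``localhost``, port ``6379``, and database ``1``.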

Worker set up
-------------

2 changes: 2 additions & 0 deletions docs/user-guide/database.rst
@@ -13,6 +13,8 @@ Safir uses the `asyncpg`_ PostgreSQL database driver.
Database support in Safir is optional.
To use it, depend on ``safir[db]`` in your pip requirements.

Also see :ref:`pydantic-dsns` for Pydantic types that help with configuring the PostgreSQL DSN.

Initializing a database
=======================

49 changes: 49 additions & 0 deletions docs/user-guide/pydantic.rst
@@ -5,6 +5,55 @@ Utilities for Pydantic models
Several validation and configuration problems arise frequently with Pydantic models.
Safir offers some utility functions to assist in solving them.

.. _pydantic-dsns:

Configuring PostgreSQL and Redis DSNs
=====================================

Databases and other storage services often use a :abbr:`DSN (Data Source Name)` to specify how to connect to the service.
Pydantic provides multiple pre-defined types to parse and validate those DSNs, including ones for PostgreSQL and Redis.

Safir applications often use tox-docker_ to start local PostgreSQL and Redis servers before running tests.
tox-docker starts services on random loopback IP addresses and ports, and stores the hostname and port in standard environment variables such as ``POSTGRES_HOST`` and ``POSTGRES_5432_TCP_PORT``.

Safir provides alternative data types for PostgreSQL and Redis DSNs that behave largely the same as the Pydantic data types if the tox-docker environment variables aren't set.
If the tox-docker variables are set, their contents are used to override the hostname and port of any provided DSN with the values provided by tox-docker.
This allows the application to get all of its configuration from environment variables at module load time without needing special code in every application to handle the tox-docker environment variables.
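
The override behavior can be approximated with the standard library alone.
The sketch below is not Safir's implementation, which operates on ``pydantic_core.Url`` objects; ``apply_tox_docker_location`` is a hypothetical helper, and the ``environ`` parameter is an assumption added so the example can run without touching the real environment.
The variable names ``POSTGRES_5432_TCP_PORT`` and ``POSTGRES_HOST`` are the ones read by the validator in this change.

```python
import os
from collections.abc import Mapping
from urllib.parse import urlsplit, urlunsplit


def apply_tox_docker_location(
    dsn: str,
    *,
    port_var: str,
    host_var: str,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Override the host and port of a DSN from environment variables.

    Hypothetical helper: if the port variable is unset, the DSN is returned
    unchanged. IPv6 literal hosts are not handled in this sketch.
    """
    port = environ.get(port_var)
    if not port:
        return dsn
    parts = urlsplit(dsn)
    # Fall back to the DSN's own host if the host variable is not set.
    host = environ.get(host_var, parts.hostname or "localhost")
    # Keep any "user:password@" prefix exactly as it was written.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{at}{host}:{int(port)}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


rewritten = apply_tox_docker_location(
    "postgresql://user:password@localhost/other?connect_timeout=10",
    port_var="POSTGRES_5432_TCP_PORT",
    host_var="POSTGRES_HOST",
    environ={"POSTGRES_5432_TCP_PORT": "8999"},
)
print(rewritten)
# postgresql://user:password@localhost:8999/other?connect_timeout=10
```

Only the host and port change; the scheme, credentials, path, and query string of the configured DSN are preserved.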

For PostgreSQL DSNs, use the data type `safir.pydantic.EnvAsyncPostgresDsn` instead of `pydantic.PostgresDsn`.
This type additionally forces the scheme of the PostgreSQL DSN to either not specify the underlying library or to specify asyncpg, allowing it to work correctly with the :doc:`Safir database API <database>`.
Unlike the Pydantic type, `~safir.pydantic.EnvAsyncPostgresDsn` only supports a single host.

For Redis DSNs, use the data type `safir.pydantic.EnvRedisDsn` instead of `pydantic.RedisDsn`.
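
The scheme restrictions of the two types can be pictured with a small standard-library check.
The scheme lists below are taken from the ``UrlConstraints`` declarations in this change; ``has_allowed_scheme`` is a hypothetical helper for illustration.
Real validation is performed by Pydantic, which raises ``ValidationError`` rather than returning a boolean.

```python
from urllib.parse import urlsplit

# Schemes accepted by the two Safir types, per their UrlConstraints.
POSTGRES_SCHEMES = frozenset({"postgresql", "postgresql+asyncpg"})
REDIS_SCHEMES = frozenset({"redis"})


def has_allowed_scheme(dsn: str, allowed: frozenset) -> bool:
    """Return whether the DSN's scheme is in the allowed set (sketch only)."""
    return urlsplit(dsn).scheme in allowed


# asyncpg is accepted, other drivers such as psycopg2 are not.
print(has_allowed_scheme("postgresql+asyncpg://localhost/db", POSTGRES_SCHEMES))
print(has_allowed_scheme("postgresql+psycopg2://localhost/db", POSTGRES_SCHEMES))
# The TLS scheme "rediss" is not accepted by EnvRedisDsn.
print(has_allowed_scheme("rediss://example.com/0", REDIS_SCHEMES))
```
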

For example:

.. code-block:: python

   from pydantic_settings import BaseSettings, SettingsConfigDict
   from safir.pydantic import EnvAsyncPostgresDsn, EnvRedisDsn


   class Config(BaseSettings):
       database_url: EnvAsyncPostgresDsn
       redis_url: EnvRedisDsn

       model_config = SettingsConfigDict(
           env_prefix="EXAMPLE_", case_sensitive=False
       )

These types only adjust DSNs that are already set through the normal configuration mechanism.
They do not synthesize a DSN if none is set.
Therefore, the application will still need to set the corresponding environment variables in :file:`tox.ini` for testing purposes, although the hostname and port can be dummy values.
In this case, that would look something like:

.. code-block:: ini

   [testenv:py]
   setenv =
       EXAMPLE_DATABASE_URL = postgresql://example@localhost/example
       EXAMPLE_REDIS_URL = redis://localhost/0

.. _pydantic-datetime:

Normalizing datetime fields
93 changes: 92 additions & 1 deletion src/safir/pydantic.py
@@ -2,11 +2,19 @@

from __future__ import annotations

import os
from collections.abc import Callable
from datetime import UTC, datetime, timedelta
from typing import Annotated, Any, ParamSpec, TypeAlias, TypeVar

-from pydantic import BaseModel, BeforeValidator, ConfigDict
+from pydantic import (
+    AfterValidator,
+    BaseModel,
+    BeforeValidator,
+    ConfigDict,
+    UrlConstraints,
+)
from pydantic_core import Url

from .datetime import parse_timedelta

@@ -15,6 +23,8 @@

__all__ = [
"CamelCaseModel",
"EnvAsyncPostgresDsn",
"EnvRedisDsn",
"HumanTimedelta",
"SecondsTimedelta",
"normalize_datetime",
@@ -24,6 +34,87 @@
]


def _validate_env_async_postgres_dsn(v: Url) -> Url:
    """Possibly adjust a PostgreSQL DSN based on environment variables.

    When run via tox and tox-docker, the PostgreSQL hostname and port will be
    randomly selected and exposed only in environment variables. We have to
    patch that into the database URL at runtime since `tox doesn't have a way
    of substituting it into the environment
    <https://github.com/tox-dev/tox-docker/issues/55>`__.
    """
    if port := os.getenv("POSTGRES_5432_TCP_PORT"):
        return Url.build(
            scheme=v.scheme,
            username=v.username,
            password=v.password,
            host=os.getenv("POSTGRES_HOST", v.unicode_host() or "localhost"),
            port=int(port),
            path=v.path.lstrip("/") if v.path else v.path,
            query=v.query,
            fragment=v.fragment,
        )
    else:
        return v


EnvAsyncPostgresDsn: TypeAlias = Annotated[
    Url,
    UrlConstraints(
        host_required=True,
        allowed_schemes=["postgresql", "postgresql+asyncpg"],
    ),
    AfterValidator(_validate_env_async_postgres_dsn),
]
"""Async PostgreSQL data source URL honoring Docker environment variables.

Unlike the standard Pydantic ``PostgresDsn`` type, this type does not support
multiple hostnames because Safir's database library does not support multiple
hostnames.
"""


def _validate_env_redis_dsn(v: Url) -> Url:
    """Possibly adjust a Redis DSN based on environment variables.

    When run via tox and tox-docker, the Redis hostname and port will be
    randomly selected and exposed only in environment variables. We have to
    patch that into the Redis URL at runtime since `tox doesn't have a way of
    substituting it into the environment
    <https://github.com/tox-dev/tox-docker/issues/55>`__.
    """
    if port := os.getenv("REDIS_6379_TCP_PORT"):
        return Url.build(
            scheme=v.scheme,
            username=v.username,
            password=v.password,
            host=os.getenv("REDIS_HOST", v.unicode_host() or "localhost"),
            port=int(port),
            path=v.path.lstrip("/") if v.path else v.path,
            query=v.query,
            fragment=v.fragment,
        )
    else:
        return v


EnvRedisDsn: TypeAlias = Annotated[
    Url,
    UrlConstraints(
        allowed_schemes=["redis"],
        default_host="localhost",
        default_port=6379,
        default_path="/0",
    ),
    AfterValidator(_validate_env_redis_dsn),
]
"""Redis data source URL honoring Docker environment variables.

Unlike the standard Pydantic ``RedisDsn`` type, this does not support the
``rediss`` scheme, which indicates the use of TLS.
"""


def _validate_human_timedelta(v: str | float | timedelta) -> float | timedelta:
if not isinstance(v, str):
return v
115 changes: 115 additions & 0 deletions tests/pydantic_test.py
@@ -15,6 +15,8 @@

from safir.pydantic import (
CamelCaseModel,
EnvAsyncPostgresDsn,
EnvRedisDsn,
HumanTimedelta,
SecondsTimedelta,
normalize_datetime,
@@ -24,6 +26,119 @@
)


def test_env_async_postgres_dsn(monkeypatch: pytest.MonkeyPatch) -> None:
    class TestModel(BaseModel):
        dsn: EnvAsyncPostgresDsn

    monkeypatch.delenv("POSTGRES_5432_TCP_PORT", raising=False)
    monkeypatch.delenv("POSTGRES_HOST", raising=False)
    model = TestModel.model_validate(
        {"dsn": "postgresql://localhost:7777/some-database"}
    )
    assert model.dsn.scheme == "postgresql"
    assert not model.dsn.username
    assert not model.dsn.password
    assert model.dsn.host == "localhost"
    assert model.dsn.port == 7777
    assert model.dsn.path == "/some-database"
    assert not model.dsn.query

    model = TestModel.model_validate(
        {
            "dsn": (
                "postgresql+asyncpg://user:password@localhost/other"
                "?connect_timeout=10"
            )
        }
    )
    assert model.dsn.scheme == "postgresql+asyncpg"
    assert model.dsn.username == "user"
    assert model.dsn.password == "password"
    assert model.dsn.host == "localhost"
    assert not model.dsn.port
    assert model.dsn.path == "/other"
    assert model.dsn.query == "connect_timeout=10"

    monkeypatch.setenv("POSTGRES_5432_TCP_PORT", "8999")
    model = TestModel.model_validate(
        {
            "dsn": (
                "postgresql://user:password@localhost/other?connect_timeout=10"
            )
        }
    )
    assert model.dsn.scheme == "postgresql"
    assert model.dsn.username == "user"
    assert model.dsn.password == "password"
    assert model.dsn.host == "localhost"
    assert model.dsn.port == 8999
    assert model.dsn.path == "/other"
    assert model.dsn.query == "connect_timeout=10"

    monkeypatch.setenv("POSTGRES_HOST", "example.com")
    model = TestModel.model_validate({"dsn": "postgresql://localhost/other"})
    assert model.dsn.scheme == "postgresql"
    assert not model.dsn.username
    assert not model.dsn.password
    assert model.dsn.host == "example.com"
    assert model.dsn.port == 8999
    assert model.dsn.path == "/other"
    assert not model.dsn.query

    with pytest.raises(ValidationError):
        TestModel.model_validate(
            {"dsn": "postgresql+psycopg2://localhost/other"}
        )


def test_env_redis_dsn(monkeypatch: pytest.MonkeyPatch) -> None:
    class TestModel(BaseModel):
        dsn: EnvRedisDsn

    monkeypatch.delenv("REDIS_6379_TCP_PORT", raising=False)
    monkeypatch.delenv("REDIS_HOST", raising=False)
    model = TestModel.model_validate(
        {"dsn": "redis://user:password@example.com:7777/1"}
    )
    assert model.dsn.scheme == "redis"
    assert model.dsn.username == "user"
    assert model.dsn.password == "password"
    assert model.dsn.host == "example.com"
    assert model.dsn.port == 7777
    assert model.dsn.path == "/1"

    model = TestModel.model_validate({"dsn": "redis://localhost"})
    assert model.dsn.scheme == "redis"
    assert not model.dsn.username
    assert not model.dsn.password
    assert model.dsn.host == "localhost"
    assert model.dsn.port == 6379
    assert model.dsn.path == "/0"

    monkeypatch.setenv("REDIS_6379_TCP_PORT", "4567")
    model = TestModel.model_validate(
        {"dsn": "redis://user:password@example.com:7777/1"}
    )
    assert model.dsn.scheme == "redis"
    assert model.dsn.username == "user"
    assert model.dsn.password == "password"
    assert model.dsn.host == "example.com"
    assert model.dsn.port == 4567
    assert model.dsn.path == "/1"

    monkeypatch.setenv("REDIS_HOST", "127.12.0.1")
    model = TestModel.model_validate({"dsn": "redis://localhost"})
    assert model.dsn.scheme == "redis"
    assert not model.dsn.username
    assert not model.dsn.password
    assert model.dsn.host == "127.12.0.1"
    assert model.dsn.port == 4567
    assert model.dsn.path == "/0"

    with pytest.raises(ValidationError):
        TestModel.model_validate({"dsn": "rediss://example.com/0"})


def test_human_timedelta() -> None:
class TestModel(BaseModel):
delta: HumanTimedelta