Merge pull request #327 from lsst-sqre/tickets/DM-47262b

DM-47262: Document testing Alembic config against schema
lsst-sqre · Nov 14, 2024 · cda04c2 · cda04c2
2 parents 4f13093 + 59c9d11
commit cda04c2
Show file tree

Hide file tree

Showing 6 changed files with 129 additions and 9 deletions.
diff --git a/changelog.d/20241113_165421_rra_DM_47262b.md b/changelog.d/20241113_165421_rra_DM_47262b.md
@@ -0,0 +1,7 @@
+### New features
+
+- Add `safir.database.drop_database` utility function to drop all database tables mentioned in a SQLAlchemy ORM schema and delete the Alembic version information if it is present. This is primarily useful for tests.
+
+### Other changes
+
+- Document how to test an application's database schema against its Alembic migrations to ensure that the schema hasn't been modified without adding a corresponding migration.
diff --git a/docs/user-guide/database/schema.rst b/docs/user-guide/database/schema.rst
@@ -328,10 +328,11 @@ This ensures that the checks for the schema will pass when executing tests.
 When cleaning out the test database between tests, call `~safir.database.unstamp_database` after dropping the application's database tables if you want to fully reset the database to its state before running the test.
 The ``reset=True`` flag of `~safir.database.initialize_database` does not do this.
 
-Creating database migrations
-============================
+Create the initial database migration
+=====================================
 
-Whenever the database schema changes, you will need to create an Alembic migration.
+Alembic works more smoothly if the first release of the service has an initial database migration, at the head of the migration dependency chain, that creates the full database schema.
+You should therefore generate an Alembic database migration from an empty database once you've configured Alembic for the first time.
 
 Add a docker-compose configuration
 ----------------------------------
@@ -380,7 +381,7 @@ Add the environment variable settings to that environment that tell your applica
 
 Change the database name, username, and environment variable prefix to match your application.
 
-You will also need a tox environment that runs your application's command-line interface.
+For later migrations (although not for the first migration), you will also need a tox environment that runs your application's command-line interface.
 This will look something like the following:
 
 .. code-block:: toml
@@ -397,10 +398,33 @@ This will look something like the following:
 
 As above, change the database name, username, command name, and environment variable prefix to match your application.
 
-Create the migration
---------------------
+Create the initial migration
+----------------------------
+
+#. Start a PostgreSQL server with an empty database.
+
+   .. prompt:: bash
+
+      docker-compose -f alembic/docker-compose.yaml up
+
+#. Ask Alembic to autogenerate a database migration from that empty database to the initial schema.
 
-You're now ready to create the database migration.
+   .. prompt:: bash
+
+      tox run -e alembic -- revision --autogenerate -m "Initial schema."
+
+   This will create a new file in :file:`alembic/versions`.
+
+#. Stop the running PostgreSQL container.
+
+   .. prompt:: bash
+
+      docker-compose -f alembic/docker-compose.yaml down
+
+Creating database migrations
+============================
+
+Whenever the database schema changes, you will need to create an Alembic migration.
 
 #. Start a PostgreSQL server into which the current database schema can be created.
 
@@ -471,3 +495,45 @@ The details of how to set up this Helm hook will depend on the details of your a
 Applications that use CloudSQL will need some special support for running the CloudSQL sidedar.
 
 See `the Gafaelfawr Helm chart in Phalanx <https://github.com/lsst-sqre/phalanx/tree/main/applications/gafaelfawr>`__ for an example.
+
+.. _database-alembic-testing:
+
+Testing database migrations
+===========================
+
+Now that your application is using Alembic, you will want to test that you do not accidentally introduce a database schema change.
+
+The easiest way to do this is to add a schema test that fails if the schema created by applying all Alembic migrations does not match the current SQLAlchemy ORM schema definition.
+This will require adding an Alembic migration at the same time as a schema change, which is generally what you want.
+
+The following test can be dropped into :file:`tests/schema_test.py` and should work for most applications that follow the Safir documentation.
+
+.. code-block:: python
+
+   import subprocess
+
+   import pytest
+   from safir.database import create_database_engine, drop_database
+
+   from example.config import config
+   from example.schema import SchemaBase
+
+
+   @pytest.mark.asyncio
+   async def test_schema() -> None:
+       engine = create_database_engine(
+           config.database_url, config.database_password
+       )
+       await drop_database(engine, SchemaBase.metadata)
+       await engine.dispose()
+       subprocess.run(["alembic", "upgrade", "head"], check=True)
+       subprocess.run(["alembic", "check"], check=True)
+
+As always, replace ``example`` with the module of your application.
+This assumes that ``example.schema.SchemaBase`` is the declarative base of your SQLAlchemy ORM schema.
+Adjust as needed for your application.
+
+.. warning::
+
+   This test can only catch schema changes that Alembic knows how to generate migrations for.
+   Changes that Alembic misses, such as changes to the membership of an enum, will not be caught by this test.
diff --git a/docs/user-guide/database/testing.rst b/docs/user-guide/database/testing.rst
@@ -7,6 +7,8 @@ While support for SQLite could be added, testing against the database that will
 
 The recommended strategy for testing applications that use a database is to start a real PostgreSQL server for the tests.
 
+Also see :ref:`database-alembic-testing` for information on how to test Alembic schema management.
+
 Using tox-docker
 ================
 

diff --git a/safir/src/safir/database/__init__.py b/safir/src/safir/database/__init__.py
@@ -11,7 +11,11 @@
 )
 from ._connection import create_async_session, create_database_engine
 from ._datetime import datetime_from_db, datetime_to_db
-from ._initialize import DatabaseInitializationError, initialize_database
+from ._initialize import (
+    DatabaseInitializationError,
+    drop_database,
+    initialize_database,
+)
 from ._retry import RetryP, RetryT, retry_async_transaction
 
 __all__ = [
@@ -21,6 +25,7 @@
     "create_database_engine",
     "datetime_from_db",
     "datetime_to_db",
+    "drop_database",
     "initialize_database",
     "is_database_current",
     "retry_async_transaction",

diff --git a/safir/src/safir/database/_initialize.py b/safir/src/safir/database/_initialize.py
@@ -10,8 +10,11 @@
 from sqlalchemy.sql.schema import MetaData
 from structlog.stdlib import BoundLogger
 
+from ._alembic import unstamp_database
+
 __all__ = [
     "DatabaseInitializationError",
+    "drop_database",
     "initialize_database",
 ]
 
@@ -20,6 +23,21 @@ class DatabaseInitializationError(Exception):
     """Database initialization failed."""
 
 
+async def drop_database(engine: AsyncEngine, schema: MetaData) -> None:
+    """Drop all tables from the database.
+
+    Parameters
+    ----------
+    engine
+        Engine to use to issue the SQL commands.
+    schema
+        Database ORM schema.
+    """
+    async with engine.begin() as conn:
+        await conn.run_sync(schema.drop_all)
+    await unstamp_database(engine)
+
+
 async def initialize_database(
     engine: AsyncEngine,
     logger: BoundLogger,

diff --git a/safir/tests/database_test.py b/safir/tests/database_test.py
@@ -23,6 +23,7 @@
     create_database_engine,
     datetime_from_db,
     datetime_to_db,
+    drop_database,
     initialize_database,
     is_database_current,
     retry_async_transaction,
@@ -36,7 +37,7 @@
 
 
 @pytest.mark.asyncio
-async def test_database_init(
+async def test_initialize_database(
     database_url: str, database_password: str
 ) -> None:
     logger = structlog.get_logger(__name__)
@@ -72,6 +73,27 @@ async def test_database_init(
     await engine.dispose()
 
 
+@pytest.mark.asyncio
+async def test_drop_database(
+    database_url: str, database_password: str
+) -> None:
+    logger = structlog.get_logger(__name__)
+    engine = create_database_engine(database_url, database_password)
+    await initialize_database(engine, logger, schema=BaseV2.metadata)
+    session = await create_async_session(engine, logger)
+    async with session.begin():
+        session.add(UserV2(username="someuser"))
+    await session.remove()
+
+    await drop_database(engine, BaseV2.metadata)
+    session = await create_async_session(engine, logger)
+    with pytest.raises(ProgrammingError):
+        async with session.begin():
+            await session.scalars(select(UserV2.username))
+    await session.remove()
+    await engine.dispose()
+
+
 def test_build_database_url(database_url: str) -> None:
     url = build_database_url("postgresql://[email protected]/foo", None)
     assert url == "postgresql+asyncpg://[email protected]/foo"