
Longevity Migrations Ideas for Future Directions

John Sullivan edited this page Oct 16, 2017 · 11 revisions

The current implementation (release 0.26) of longevity migrations is pretty barebones, and satisfies two basic needs. First, it satisfies the basic need to migrate the schema and database backing your domain. Second, it is a platform we can work with to generate more sophisticated migrations. There are two major dimensions in which the migrations implementation can be improved: expressivity and performance.

This document outlines some ideas I have for improving and extending the existing migrations codebase.

Unchanged Migration Steps

Presently, there are exactly three kinds of migration steps: Drop, Create, and Update. The first two remove and add persistent types to the domain model, respectively. The Update step migrates the data from the initial domain model into the final domain model according to a Scala function that migrates the data on a per-object basis.

An Unchanged migration step would be a clear option to add to the existing steps. It would save the user the hassle of defining what is essentially an identity function, but actually isn't, because the function has to convert from, say, version1.User to version2.User. Even if these have exactly the same shape, an identity function clearly won't do, because the two users have different types.

Aside from making life slightly easier for the user, this would expose a very nice optimization in some cases - the table could just be renamed instead of having to apply the conversion function in Scala memory. (Note that along with renaming the table, we may well have to tear down and build some database keys and indexes.)

This optimization would not be available for Cassandra, because you cannot rename a table in Cassandra. Also, take note that if the structure of the primary key changes, you will not be able to reuse the same table; you will have to rebuild it. However, the Cassandra back end would probably still be able to perform an Unchanged migration without loading the data into Scala memory. We'd probably have to use something like the CQL COPY command.
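To illustrate why an identity function won't do here, a minimal sketch (the version1/version2 packages and the unchanged builder method are hypothetical stand-ins, not the real longevity API):

```scala
// Hypothetical stand-ins for two versions of the domain model; in a
// real longevity project these would be separate model packages.
object version1 { case class User(username: String, email: String) }
object version2 { case class User(username: String, email: String) }

// Even though the two User types have exactly the same shape, an
// identity function won't typecheck. The Update step still needs an
// explicit, boilerplate conversion:
val convert: version1.User => version2.User =
  u => version2.User(u.username, u.email)

// A hypothetical Unchanged step would generate this boilerplate, e.g.:
// Migration.Builder.unchanged[version1.User, version2.User]
```
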

Migration steps that are aware of the initial database

One kind of migration that is not currently possible is one that merges two persistent types into a single persistent type. Think merging two entity aggregates into a single entity aggregate. For instance, maybe you have User and UserProfile persistent types, and want to merge them into a single persistent type User.

This is certainly possible to do with longevity, but not within a single migration. One possibility here would be to provide an InitialAwareUpdate step (hopefully with a better name) that would take a function that also receives a Repo[M1] as an argument. So where we have:

Migration.Builder.update[P1, P2](f: P1 => P2)

We could also have:

Migration.Builder.updateInitialAware[P1, P2](f: (P1, Repo[M1]) => P2)

One complication here: we probably want to have this function return an IO[P2] (or a more generic F[P2]), so that the user can actually use the Repo without resorting to unsafe blocking calls.
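A sketch of how such a step might perform the User/UserProfile merge described above. All of the names here are stand-ins, not the real longevity API, and the IO type is a toy thunk wrapper standing in for cats.effect.IO:

```scala
// A toy IO, standing in for cats.effect.IO in this sketch.
final case class IO[A](run: () => A) {
  def map[B](f: A => B): IO[B] = IO(() => f(run()))
}

// Hypothetical persistent types from the initial and final models.
case class UserProfile(username: String, bio: String)
case class UserV1(username: String)
case class UserV2(username: String, bio: String)

// M1 marks the initial model; Repo[M] is a hypothetical repository.
trait M1
trait Repo[M] {
  def retrieveProfile(username: String): IO[Option[UserProfile]]
}

// The conversion function consults the initial database to merge
// UserProfile data into the new User, returning IO so the repository
// call need not block:
def mergeProfile(u: UserV1, repo: Repo[M1]): IO[UserV2] =
  repo.retrieveProfile(u.username)
    .map(p => UserV2(u.username, p.fold("")(_.bio)))
```
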

Migration steps that are aware of the final database

Another use case would be to introduce a new read view on our data. As an overly simplistic example, suppose we wanted to introduce UserView, that exposed only a subset of the data in a User. We can accomplish this in longevity, but not in a migration. To assist in situations like this, we can give the user access to the final repository, so they can read and update rows in the final model. Something like:

Migration.Builder.updateFinalAware[P1, P2](f: (P1, Repo[M2]) => P2)

We need to make clear to the user that the data available via Repo[M2] is in an intermediate state, as we are in the process of producing that data.

We should probably also be able to combine these two ideas, so the user has access to both the initial and final states. We could also consider a case where the migration is not actually one-to-one, but where the user has free rein to write to the M2 database based on each row it sees from M1. Something like:

Migration.Builder.updateFreely[P1, P2](f: (P1, Repo[M1], Repo[M2]) => IO[Unit])
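As a sketch of what such a free-form step might enable, here each initial-model row produces two final-model rows, a User and a UserView. Every type here is a minimal stand-in, not the real longevity API, and IO is again a toy thunk wrapper standing in for cats.effect.IO:

```scala
import scala.collection.mutable.ListBuffer

// A toy IO, standing in for cats.effect.IO in this sketch.
final case class IO[A](run: () => A) {
  def map[B](f: A => B): IO[B] = IO(() => f(run()))
  def flatMap[B](f: A => IO[B]): IO[B] = IO(() => f(run()).run())
}

// Hypothetical persistent types: the read view exposes a subset of User.
case class User(username: String, email: String)
case class UserView(username: String)

// M2 marks the final model; this stand-in Repo just records creations.
trait M2
class Repo[M] {
  val created: ListBuffer[Any] = ListBuffer.empty
  def create[P](p: P): IO[P] = IO(() => { created += p; p })
}

// For each initial-model row, the user may write any number of rows to
// the final model; here one row yields both a User and a UserView:
def migrateRow(u: User, m2: Repo[M2]): IO[Unit] =
  m2.create(u).flatMap(_ => m2.create(UserView(u.username)).map(_ => ()))
```
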

Generalize the Migrator to work with other effects aside from cats.effect.IO

This probably wouldn't be too hard. It would also be a good idea to provide implicits for cats.Cartesian, cats.Applicative, and the like, for any longevity.effect.Effect.
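The shape of the generalization might look something like the following sketch. The Effect trait here is a minimal assumption loosely modeled on the idea of longevity.effect.Effect, not the actual trait:

```scala
import scala.language.higherKinds

// A minimal effect typeclass; an assumed sketch, not the real
// longevity.effect.Effect.
trait Effect[F[_]] {
  def pure[A](a: A): F[A]
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

// A migration step written against any F with an Effect instance,
// rather than being hard-coded to cats.effect.IO:
def migrateOne[F[_], P1, P2](p1: P1)(f: P1 => P2)(implicit F: Effect[F]): F[P2] =
  F.map(F.pure(p1))(f)

// For illustration only: an Effect instance for the identity "effect".
type Id[A] = A
implicit val idEffect: Effect[Id] = new Effect[Id] {
  def pure[A](a: A): Id[A] = a
  def map[A, B](fa: Id[A])(f: A => B): Id[B] = f(fa)
}
```
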

TODO review

Migration steps described via properties (similar to a query language), instead of by functions

todo

Soft stop migrations

todo

Hard Stop Migrations

todo