
Introduce Magic Migrate #246

Closed
wants to merge 2 commits into from

Conversation

schneems
Contributor

The goal of magic migrate is to simplify versioned metadata storage.

## The problem

Cloud Native Buildpacks use TOML to store metadata about a layer between builds. We commonly use this to store information like which Ruby version was downloaded or a SHA of some kind. On the next build we can look at this information to determine whether some expensive process (like downloading a binary) can be skipped. Essentially we treat it like a cache key.

If we cannot load (deserialize) the old metadata into the currently requested structure, the default behavior is to clear the cache. This means the programmer must either be careful to never make backwards-incompatible changes to the metadata, or risk triggering a cache invalidation.
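
As a rough illustration of that default behavior, here is a minimal sketch using serde and the `toml` crate; the `Metadata` struct and its fields are hypothetical, not taken from any particular buildpack:

```rust
use serde::{Deserialize, Serialize};

// Hypothetical layer metadata; real buildpacks store fields like these in TOML.
#[derive(Serialize, Deserialize, PartialEq, Eq)]
struct Metadata {
    ruby_version: String,
    gemfile_lock_sha: String,
}

/// Returns true if the expensive work (e.g. downloading a binary) can be skipped.
fn can_reuse_layer(cached_toml: &str, current: &Metadata) -> bool {
    match toml::from_str::<Metadata>(cached_toml) {
        // Old metadata deserializes and matches the current key: reuse the layer.
        Ok(old) => old == *current,
        // Schema changed in a backwards-incompatible way: the cache gets cleared.
        Err(_) => false,
    }
}
```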

Now consider that we cannot guarantee that the cache was generated from the last version of the buildpack. Someone might deploy, then wait several years before deploying again.

The classic Ruby buildpack has this problem. Its cache is unversioned, so if a mistake is made in the cache structure or contents in one version, then the fix must be hardcoded and checked on every subsequent deploy of every future version: https://github.com/heroku/heroku-buildpack-ruby/blob/453b13983b638d68d9d65ab89d36a2fc18128e4a/lib/language_pack/ruby.rb#L1270-L1332.

## Introducing magic migrate

Magic migrate doesn't make these problems go away; instead, it makes them easier to reason about.

When the schema of the metadata changes, the programmer can introduce a new struct and tell Rust how to migrate from one version to the next using either `From` or `TryFrom` (if the conversion is fallible). Then they use the corresponding magic migrate trait to tell Rust how to walk this chain backwards. Now, when we try to load data from disk, it will try to load the latest struct; if it can't, it will go to the one before, and so on. Once it finds the struct that the data was originally serialized from, it converts it forwards, one step at a time, until we arrive at the currently desired struct.
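
As a rough sketch of that chain, assuming two hypothetical metadata versions and the `toml` crate, here is the fallback-and-convert loop hand-rolled; this is what magic migrate is meant to automate, not the crate's actual API:

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct MetadataV1 {
    ruby_version: String,
}

#[derive(Serialize, Deserialize)]
struct MetadataV2 {
    ruby_version: String,
    // New field added in a later buildpack release.
    stack: String,
}

// Teach Rust how to go forwards one step in the chain.
impl From<MetadataV1> for MetadataV2 {
    fn from(old: MetadataV1) -> Self {
        MetadataV2 {
            ruby_version: old.ruby_version,
            stack: String::from("unknown"),
        }
    }
}

// Walk the chain backwards: try the newest struct first, then older ones,
// converting forwards once something deserializes.
fn load_metadata(toml_str: &str) -> Option<MetadataV2> {
    if let Ok(latest) = toml::from_str::<MetadataV2>(toml_str) {
        return Some(latest);
    }
    if let Ok(old) = toml::from_str::<MetadataV1>(toml_str) {
        return Some(MetadataV2::from(old));
    }
    None // Nothing matched: fall back to clearing the cache.
}
```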

Now, instead of trying to hold in mind every possible cache state from every possible version of the code, the programmer only needs to know how to make each conversion, one step at a time.

@schneems schneems force-pushed the schneems/magic-migrate branch from 85415a7 to 6856d33 Compare January 8, 2024 18:50
@schneems schneems marked this pull request as ready for review January 8, 2024 18:58
@schneems schneems requested a review from a team as a code owner January 8, 2024 18:58
@schneems schneems marked this pull request as draft January 12, 2024 14:05
@edmorley edmorley removed the request for review from a team January 26, 2024 22:47
@schneems schneems closed this May 17, 2024