UTxO-HD targeting `main` #1267

jasagredo · 2024-09-26T09:45:04Z

Description

The changes from UTxO-HD span ouroboros-consensus, ouroboros-consensus-diffusion and ouroboros-consensus-cardano. The core change is:

The UTxO set is extracted from the LedgerState in the form of LedgerTables.
These tables are stored in the LedgerDB, which can keep them in memory or on disk.
When performing an action that requires UTxOs, we have to ask the LedgerDB for those. This might perform IO.

Here I explain how I would review this enormous PR. Instead of listing files I describe concepts; my suggestion is to look at the mentioned files (or search for the concepts), then mark each file as viewed to offload it from your brain.

The ledger tables

The first step would be to understand the concept of LedgerTables, see Ouroboros.Consensus.Ledger.Tables.* modules. The LedgerTables are parametrized by l (in the end it will be by blk) and by mk (or MapKinds). MapKinds are just types parametrized by the Key and Value of l. These will be TxIn|TxOut for unitary blocks and CanonicalTxIn|HardForkTxOut for hard fork blocks.
LedgerTables are barbies-like, see Ouroboros.Consensus.Ledger.Tables.Combinators.
LedgerTables are (most commonly) empty (EmptyMK), a (possibly restricted) UTxO set (ValuesMK), a set of TxIns (KeysMK), a sequence of differences (DiffMK) or a combination of values + diffs (TrackingMK). The only non-obvious one is DiffMK, which is a map of sequences of changes to a value (in the UTxO case values don't change, they are created and destroyed, so there will be at most 2 elements there). On top of that there is a DiffSeqMK, which is a fingertree of differences. It is only used in V1 (see below).
The LedgerState is itself parametrized by this same mk. The data instances will then make use of that mk to define tables associated with the block. So the Byron ledger state ignores it, the Shelley ledger state has a new field with the tables, and the hard fork ledger state propagates the mk through the Telescope, therefore having an mk of the particular state in the Telescope.
The LedgerTables can live on their own, which for unitary blocks doesn't make a difference, but for the Cardano Block, we go from an mk passed to the Telescope (therefore tables at the tip of the Telescope) to CardanoLedgerTables, in which each value is a HardForkTxOut. This cost is non-trivial and we only want to pay it when applying a new block/transaction.
LedgerTables can be extracted and injected into the ledger state via (un)stowLedgerTables.
The ledger tables of the Extended ledger state are the same as the ones from the LedgerState.
A very important bit that maybe was not clear above is that the HardForkBlock has no canonical tables because our definitions are not compositional for the HF block; only the CardanoBlock has "hard fork tables". See the constraints of HasHardForkLedgerTables.
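
As a rough mental model of the map kinds described above, here is a toy sketch. It is not the code in Ouroboros.Consensus.Ledger.Tables.*: the Delta type and every function name below are made up for illustration, and the diff is simplified to a single change per key instead of a sequence of changes.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)

-- | Simplified difference for one key: in the UTxO case an entry is
-- either created or destroyed, never modified in place.
data Delta v = Insert v | Delete
  deriving (Eq, Show)

-- Toy map kinds: each one is a type parametrized by key and value.
data    EmptyMK    k v = EmptyMK                                deriving (Eq, Show)
newtype KeysMK     k v = KeysMK   (Set k)                       deriving (Eq, Show)
newtype ValuesMK   k v = ValuesMK (Map k v)                     deriving (Eq, Show)
newtype DiffMK     k v = DiffMK   (Map k (Delta v))             deriving (Eq, Show)
data    TrackingMK k v = TrackingMK (Map k v) (Map k (Delta v)) deriving (Eq, Show)

-- | The keys of a set of values.
keysOf :: ValuesMK k v -> KeysMK k v
keysOf (ValuesMK vs) = KeysMK (Map.keysSet vs)

-- | Restrict a set of values to the requested keys.
restrictValues :: Ord k => ValuesMK k v -> KeysMK k v -> ValuesMK k v
restrictValues (ValuesMK vs) (KeysMK ks) = ValuesMK (Map.restrictKeys vs ks)

-- | Apply differences on top of values.
applyDiff :: Ord k => ValuesMK k v -> DiffMK k v -> ValuesMK k v
applyDiff (ValuesMK vs) (DiffMK ds) = ValuesMK (Map.foldlWithKey' step vs ds)
  where
    step acc k (Insert v) = Map.insert k v acc
    step acc k Delete     = Map.delete k acc

-- | Pair the values after a step with the diff relative to the values before.
trackDiff :: Ord k => ValuesMK k v -> ValuesMK k v -> TrackingMK k v
trackDiff (ValuesMK before) (ValuesMK after) = TrackingMK after diffs
  where
    diffs = Map.union (Map.map Insert (after `Map.difference` before))
                      (Map.map (const Delete) (before `Map.difference` after))

-- | Forget the values of a tracking table, keeping only the diff.
forgetValues :: TrackingMK k v -> DiffMK k v
forgetValues (TrackingMK _ ds) = DiffMK ds
```

Loaded in GHCi, applying a diff and then re-deriving it with trackDiff and forgetValues should give back the same diff, as long as the diff only deletes present keys and inserts absent ones.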

Applying and ticking (Ouroboros.Consensus.Ledger.Abstract/Basics)

When ticking a block, some differences might be created, and no values are needed. So the types go from l EmptyMK to Ticked1 l DiffMK. This happens at least at two points: when going from Byron to Shelley (all values are created here) and when going from Shelley to Allegra (AVVM addresses are deleted). See the relevant functions: translateLedgerStateByronToShelley and translateLedgerStateShelleyToAllegra.

When applying a block, we get the inputs needed (getBlockKeySets then read those from the LedgerDB), tick the ledger state without tables (possibly creating diffs), apply those diffs on the values from the LedgerDB, then call the ledger rules. We then diff the input and output tables to get a set of differences from applying a block, to which we will prepend the ones from ticking. See applyBlockResult and the Shelley functions for applying blocks.
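
The block application steps above can be sketched with a toy ledger. This is only an illustration under simplifying assumptions: TxIn, TxOut, Block, applyRules and applyBlock below are hypothetical stand-ins rather than the real applyBlockResult or the Shelley functions, and readKeys plays the role of the LedgerDB read.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set

type TxIn   = Int
type TxOut  = String
type Values = Map TxIn TxOut

data Delta = Insert TxOut | Delete
  deriving (Eq, Show)

type Diff = Map TxIn Delta

-- | Toy block: it spends some inputs and creates some outputs.
data Block = Block { spent :: Set TxIn, created :: Values }

applyDiff :: Values -> Diff -> Values
applyDiff = Map.foldlWithKey' step
  where
    step acc k (Insert v) = Map.insert k v acc
    step acc k Delete     = Map.delete k acc

diffValues :: Values -> Values -> Diff
diffValues before after =
  Map.union (Map.map Insert (after `Map.difference` before))
            (Map.map (const Delete) (before `Map.difference` after))

-- | Stand-in for the ledger rules: every spent input must be present.
applyRules :: Block -> Values -> Either String Values
applyRules blk vs
  | spent blk `Set.isSubsetOf` Map.keysSet vs =
      Right (created blk `Map.union` (vs `Map.withoutKeys` spent blk))
  | otherwise = Left "missing input"

-- | Read the needed inputs, apply the tick diffs on them, run the rules,
-- diff input against output, and combine with the tick diffs.
applyBlock
  :: Monad m
  => (Set TxIn -> m Values)  -- ^ read from the (toy) LedgerDB
  -> Diff                    -- ^ differences produced by ticking
  -> Block
  -> m (Either String Diff)
applyBlock readKeys tickDiff blk = do
  vals <- readKeys (spent blk)
  let ticked = applyDiff vals tickDiff
  pure $ do
    after <- applyRules blk ticked
    let blockDiff = diffValues ticked after
    -- Map.union is left-biased, so the later (block) diffs win per key.
    pure (blockDiff `Map.union` tickDiff)
```

Keeping readKeys as a monadic argument mirrors the point that asking for UTxOs might perform IO, while the rules themselves stay pure.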

The story with transactions is pretty similar.

The LedgerDB versions (Ouroboros.Consensus.Storage.LedgerDB)

There are two flavors of the LedgerDB, each one having two implementations:

V1 (Ouroboros.Consensus.Storage.LedgerDB.V1): we keep a sequence of EmptyMK ledger states and dump the values into a BackingStore. We can get back values from the backing store at any ledger state, by opening a BackingStoreValueHandle and reading from it. The BackingStore consists of a "complete" UTxO set at some anchor and then a sequence of differences. To get values at a given point we have to read the anchor, then reapply the differences up to the desired point. This is "wasteful" if done in memory (why keep diffs and have to reapply them every time if we can just apply them in place?) but it is useful for the on-disk implementation, which puts the "complete" UTxO set on the disk, offloading it from memory. There are two implementations:
- OnDisk: It uses LMDB underneath. See the Ouroboros.Consensus.Storage.LedgerDB.V1.BackingStore.Impl.LMDB.* modules.
- InMemory: Not intended for real use. As mentioned above it is wasteful. It serves as a reference impl for the OnDisk implementation.
V2 (Ouroboros.Consensus.Storage.LedgerDB.V2): We keep a sequence of StateRefs, which are EmptyMK ledger states together with a tables handle from which we can read values monadically. This is very similar to the previous LedgerDB, in which we kept a sequence of (complete) LedgerStates. There are two implementations:
- InMemory
- LSM: still a WIP
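
To make the V1 idea of "anchor plus sequence of differences" concrete, here is a minimal sketch. It assumes a plain list of diffs indexed by slot; the real implementation uses a DiffSeq fingertree and LMDB, and ToyBackingStore, readAt and flushTo are hypothetical names.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type Slot = Int

data Delta v = Insert v | Delete
  deriving (Eq, Show)

type Diff k v = Map k (Delta v)

-- | A "complete" set of values at the anchor plus the diffs after it.
data ToyBackingStore k v = ToyBackingStore
  { anchorSlot   :: Slot
  , anchorValues :: Map k v
  , pendingDiffs :: [(Slot, Diff k v)]  -- ^ oldest first, slots increasing
  } deriving (Eq, Show)

applyDiff :: Ord k => Map k v -> Diff k v -> Map k v
applyDiff = Map.foldlWithKey' step
  where
    step acc k (Insert v) = Map.insert k v acc
    step acc k Delete     = Map.delete k acc

-- | Values at a given slot: start at the anchor and reapply every diff
-- up to and including that slot.
readAt :: Ord k => Slot -> ToyBackingStore k v -> Map k v
readAt slot bs =
  foldl applyDiff (anchorValues bs)
        [ d | (s, d) <- pendingDiffs bs, s <= slot ]

-- | Advance the anchor by flushing the diffs up to the given slot into it.
flushTo :: Ord k => Slot -> ToyBackingStore k v -> ToyBackingStore k v
flushTo slot bs = bs
  { anchorSlot   = max slot (anchorSlot bs)
  , anchorValues = readAt slot bs
  , pendingDiffs = [ p | p@(s, _) <- pendingDiffs bs, s > slot ]
  }
```

In this model, flushing never changes what readAt returns for slots at or after the new anchor; it only moves work from read time into the stored anchor.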

Evaluating forks

In order to evaluate forks, we created the concept of Forkers, which each LedgerDB implementation realizes in its own way. They are just an abstract interface that allows querying for values and pushing differences that can eventually be dumped back into the LedgerDB (only by ChainSelection; others use ReadOnlyForkers). Note that they allocate resources, so there is some juggling with ResourceRegistries there.
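
A forker can be pictured as a small record of actions over a private view of the database. The sketch below is a toy under stated assumptions: ToyForker and its fields are hypothetical, the "LedgerDB" is just an IORef holding a map, and resource handling (the ResourceRegistry part) is left out entirely.

```haskell
import           Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

data Delta v = Insert v | Delete

type Diff k v = Map k (Delta v)

applyDiff :: Ord k => Map k v -> Diff k v -> Map k v
applyDiff = Map.foldlWithKey' step
  where
    step acc k (Insert v) = Map.insert k v acc
    step acc k Delete     = Map.delete k acc

-- | Abstract interface: read values, push differences, commit them back.
data ToyForker k v = ToyForker
  { forkerRead   :: [k] -> IO (Map k v)
  , forkerPush   :: Diff k v -> IO ()
  , forkerCommit :: IO ()
  }

-- | Open a forker on the current contents of the toy database.
newToyForker :: Ord k => IORef (Map k v) -> IO (ToyForker k v)
newToyForker dbRef = do
  base    <- readIORef dbRef     -- values at the point where we fork
  pending <- newIORef Map.empty  -- diffs pushed so far
  let view = applyDiff base <$> readIORef pending
  pure ToyForker
    { forkerRead   = \ks -> (`Map.restrictKeys` Set.fromList ks) <$> view
    , forkerPush   = \d -> modifyIORef' pending (Map.union d)  -- newer wins
    , forkerCommit = view >>= writeIORef dbRef
    }
```

A read-only forker in this picture would simply be the same record without forkerCommit. Note that this toy commit overwrites the database with the forker's view, which is only reasonable when a single writer exists.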

Ledger queries (Ouroboros.Consensus.Ledger.Query)

Some queries have to look at the UTxO set, in particular GetUtxoByAddress, GetUtxoWhole and GetUtxoByTxIn. We categorize them by means of QueryFootprint and process each category differently.

Other queries use QFNoTables, GetUtxoByTxIn uses QFLookupTables and will have to read a single value from the tables, and GetUtxoWhole and GetUtxoByAddress use QFTraverseTables as they will have to scan the whole UTxO set.

For the HardForkBlock there is another class Ouroboros.Consensus.HardFork.Combinator.Ledger.Query.BlockSupportsHFLedgerQuery which has faster implementations than projecting the tables into the particular tip of the Telescope, because we can usually judge whether we want the result without upgrading the TxOut to the latest era.

In essence, queries are now monadic. Queries that don't look at the UTxO set are artificially monadic (just a pure of the already existing logic).
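
The footprint-based dispatch can be sketched as follows. This is a simplified model, not the code in Ouroboros.Consensus.Ledger.Query: the Query and Result types, TablesHandle, and answerQuery are assumptions made for illustration, and the footprint is a plain value here rather than anything more type-level.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set

type TxIn = Int
type Addr = String

data TxOut = TxOut { txOutAddr :: Addr, txOutCoin :: Integer }
  deriving (Eq, Show)

type UTxO = Map TxIn TxOut

data QueryFootprint = QFNoTables | QFLookupTables | QFTraverseTables
  deriving (Eq, Show)

data Query
  = GetTipSlot             -- does not touch the UTxO set
  | GetUtxoByTxIn [TxIn]
  | GetUtxoByAddress Addr
  | GetUtxoWhole

data Result = TipSlot Int | UTxOResult UTxO
  deriving (Eq, Show)

queryFootprint :: Query -> QueryFootprint
queryFootprint GetTipSlot           = QFNoTables
queryFootprint (GetUtxoByTxIn _)    = QFLookupTables
queryFootprint (GetUtxoByAddress _) = QFTraverseTables
queryFootprint GetUtxoWhole         = QFTraverseTables

-- | Hypothetical handle for reading tables; both actions may do IO.
data TablesHandle m = TablesHandle
  { readKeys :: Set TxIn -> m UTxO  -- ^ point lookups
  , scanAll  :: m UTxO              -- ^ traverse the whole UTxO set
  }

inMemoryHandle :: Applicative m => UTxO -> TablesHandle m
inMemoryHandle utxo = TablesHandle
  { readKeys = pure . Map.restrictKeys utxo
  , scanAll  = pure utxo
  }

-- | All queries are monadic; the table-free one is just a 'pure'.
answerQuery :: Monad m => TablesHandle m -> Int -> Query -> m Result
answerQuery h tipSlot q = case q of
  GetTipSlot         -> pure (TipSlot tipSlot)
  GetUtxoByTxIn ks   -> UTxOResult <$> readKeys h (Set.fromList ks)
  GetUtxoByAddress a ->
    UTxOResult . Map.filter ((== a) . txOutAddr) <$> scanAll h
  GetUtxoWhole       -> UTxOResult <$> scanAll h
```

The lookup case reads only the requested keys, while both traversing queries go through scanAll, which is where a real implementation would pay for a full scan.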

The mempool

The mempool in essence will have to acquire (read only) forkers on the LedgerDB at the tip, then read values for the incoming transactions and apply them. The returned diffs are appended to the ones in the mempool, which keeps a TrackingMK with the current values and past diffs.

When revalidating transactions we cannot know if the UTxO set changed so we will have to re-read the values from the (new) forker.

The internal state is now a TMVar because we need to acquire >> read tables >> update where read tables is in IO and the others are in STM.
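
The acquire >> read tables >> update pattern can be sketched like this. It is a minimal illustration assuming the stm package; ToyMempool, withState and tryAddTx are hypothetical names, and the real mempool state and validation are far richer.

```haskell
import           Control.Concurrent.STM
                   (TMVar, atomically, newTMVarIO, putTMVar, readTMVar, takeTMVar)
import           Control.Exception (mask, onException)
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type TxIn  = Int
type TxOut = String

data ToyMempool = ToyMempool
  { mpValues :: Map TxIn TxOut  -- ^ values read so far
  , mpTxs    :: [TxIn]          -- ^ accepted "transactions", newest first
  }

newToyMempool :: IO (TMVar ToyMempool)
newToyMempool = newTMVarIO (ToyMempool Map.empty [])

-- | Take the state (STM), run an IO action on it, put the result back (STM).
-- While the TMVar is empty, other writers block, so the IO step in the
-- middle cannot race with another update.
withState :: TMVar st -> (st -> IO (st, a)) -> IO a
withState var act = mask $ \restore -> do
  st       <- atomically (takeTMVar var)
  (st', a) <- restore (act st) `onException` atomically (putTMVar var st)
  atomically (putTMVar var st')
  pure a

-- | Accept a toy transaction only if its input can be read from the tables.
tryAddTx :: TMVar ToyMempool -> (TxIn -> IO (Maybe TxOut)) -> TxIn -> IO Bool
tryAddTx var readInput txIn = withState var $ \mp -> do
  mOut <- readInput txIn  -- the IO step: reading from the forker
  pure $ case mOut of
    Nothing  -> (mp, False)
    Just out ->
      ( mp { mpValues = Map.insert txIn out (mpValues mp)
           , mpTxs    = txIn : mpTxs mp }
      , True )

currentTxs :: TMVar ToyMempool -> IO [TxIn]
currentTxs var = mpTxs <$> atomically (readTMVar var)
```

With a TVar the read-modify-write would have to be a single STM transaction, which cannot contain the IO read; emptying a TMVar is one way to hold the state across that IO step.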

The snapshots

We now store snapshots in a new format:

V1-OnDisk: a copy of the lmdb database and a (Haskell-CBOR) serialization of the LedgerState.
V*-InMemory: a (Haskell-CBOR) serialization of the UTxO set and a (Haskell-CBOR) serialization of the LedgerState.

Note that for V2 we can take snapshots of the immutable tip at any time, but for V1 we first have to flush some differences from the BackingStore into the anchor to advance it to the immutable tip.

This is abstracted by either implementation in Ouroboros.Consensus.Storage.LedgerDB.V*...tryTakeSnapshot

The forging loop

The forging loop didn't change much. Each iteration runs with a resource registry (to allocate the forkers). Then we use the forker to provide values for the mempool snapshot acquisition, in case of a revalidation.

Changes in Byron/Shelley/Cardano

The changes here are mostly fulfilling everything that was described above, to make all the types match. There are some specific things which are interesting to look at because they might be non-trivial:

Translation functions (with the two examples I already mentioned)
The TxIn|TxOut data instances, the LedgerState data instance and the HasLedgerTables instances
applyBlock for Shelley. The Cardano one is just the HFC one, which injects the CardanoTables into the tip of the Telescope (here is where we do the costly step, but it usually won't be that costly because the UTxO set for a block is small).
The Cardano.Ledger module which defines the CardanoTxIn and CardanoTxOut.

Other changes

The rest of the changes mainly just follow GHC, adjusting the types here and there. Most other code doesn't use tables, so an abstract mk or EmptyMK is used to make the kind well-formed.

jasagredo

I did a pass over the non-testing libraries.

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Mempool/Query.hs

...boros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/Impl/ChainSel.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/API.hs

...os-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/BackingStore.hs

...onsensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/BackingStore/API.hs

...ros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/DbChangelog.hs

...oros-consensus-diffusion/src/ouroboros-consensus-diffusion/Ouroboros/Consensus/NodeKernel.hs

ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/ShelleyHFC.hs

nfrisby

This is the result of my first pass on the Ouroboros.Consensus.Ledger.Tables.* modules.

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Tables.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Tables/Utils.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Tables/MapKind.hs

nfrisby

Another round of comments. This is all of the *Hard* files, except for Query.hs.

...consensus-cardano/src/ouroboros-consensus-cardano/Ouroboros/Consensus/Cardano/CanHardFork.hs

...-consensus-cardano/src/unstable-cardano-testlib/Test/ThreadNet/Infra/ShelleyBasedHardFork.hs

ouroboros-consensus-diffusion/test/consensus-test/Test/Consensus/HardFork/Combinator.hs

ouroboros-consensus-diffusion/test/consensus-test/Test/Consensus/HardFork/Combinator/A.hs

...boros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/HardFork/Combinator/InjectTxs.hs

...ros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/HardFork/Combinator/State/Types.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Tables.hs

nfrisby

All *Query*hs files, except:

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/Impl/Query.hs
ouroboros-consensus/test/consensus-test/Test/Consensus/MiniProtocol/LocalStateQuery/Server.hs
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Mempool/Query.hs

...ros-consensus-cardano/src/ouroboros-consensus-cardano/Ouroboros/Consensus/Cardano/QueryHF.hs

ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/Ledger/Query.hs

...os-consensus/src/ouroboros-consensus/Ouroboros/Consensus/HardFork/Combinator/Ledger/Query.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Query.hs

nfrisby

*LedgerDB* files, except I stopped when I got to the LMDB impl. I'll pick up there tomorrow.

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/API.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/Args.hs

...onsensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/BackingStore/API.hs

...rc/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/BackingStore/Impl/InMemory.hs

nfrisby

My previous review was the *LedgerDB* files up to but excluding LMDB.

This review picks up there and stops before V2.

...us/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/BackingStore/Impl/LMDB.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/Lock.hs

...boros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/Snapshots.hs

nfrisby

This is the LedgerDB*V2 files

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/Common.hs

...boros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/LedgerSeq.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/Init.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/LSM.hs

nfrisby

This is the LedgerDB files after V2, ie the tests.

ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB.hs

ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB/StateMachine/TestBlock.hs

ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB/StateMachine.hs

ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB/V1/DbChangelog/Unit.hs

ouroboros-consensus-cardano/app/DBAnalyser/Parsers.hs

ouroboros-consensus-cardano/app/snapshot-converter.hs

...consensus-cardano/src/ouroboros-consensus-cardano/Ouroboros/Consensus/Cardano/CanHardFork.hs

ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/ShelleyHFC.hs

ouroboros-consensus-cardano/src/unstable-byron-testlib/Test/Consensus/Byron/Generators.hs

...ros-consensus-cardano/src/unstable-byronspec/Ouroboros/Consensus/ByronSpec/Ledger/Mempool.hs

ouroboros-consensus-cardano/src/unstable-shelley-testlib/Test/Consensus/Shelley/Examples.hs

...-consensus-cardano/src/unstable-cardano-testlib/Test/ThreadNet/Infra/ShelleyBasedHardFork.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/HardFork/Combinator/Ledger.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/API.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/Args.hs

...oros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Impl/Validate.hs

...us/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/BackingStore/Impl/LMDB.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB.hs

...ros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/Impl/Background.hs

...boros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/Impl/ChainSel.hs

ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/Model.hs

ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/ChainDB/StateMachine.hs

ouroboros-consensus-diffusion/src/unstable-diffusion-testlib/Test/ThreadNet/General.hs

ouroboros-consensus-diffusion/src/unstable-diffusion-testlib/Test/ThreadNet/Network.hs

ouroboros-consensus/src/unstable-consensus-testlib/Test/Util/ChainDB.hs

ouroboros-consensus/src/unstable-consensus-testlib/Test/Util/Orphans/Arbitrary.hs

...boros-consensus-diffusion/src/unstable-mock-testlib/Test/Consensus/Ledger/Mock/Generators.hs

ouroboros-consensus/test/consensus-test/Test/Consensus/Ledger/Tables/Diff.hs

ouroboros-consensus/test/consensus-test/Test/Consensus/Ledger/Tables/DiffSeq.hs

ouroboros-consensus/test/consensus-test/Test/Consensus/MiniProtocol/BlockFetch/Client.hs

ouroboros-consensus/test/storage-test/Main.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Tables/DiffSeq.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Tables/Diff.hs

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Ledger/Tables.hs

This type for storing resources was reinventing the wheel: `quickcheck-dynamic` already keeps track of resources by storing a `Var` for each action result. `IOSim` support for tests is also removed. It would be straightforward to revive `IOSim` support in the future, if necessary.

* Rename `MockState` to `MockMonad`.
* Remove exception handler hoop jumping in `mBSClose` and `mBSVHClose`.
* Tag `ReadAfterWrite` and `RangeReadAfterWrite` only once per action sequence.
* Resolve some TODOs

I had initially decided it was best to replace uses of `ltcollapse` by some new `ltfoldMap` function, but after some thinking it's best to instead use `ltcollapse` as is, but removing the use of monoids there, which currently has no effect. The reason I think this is the right call is because ledger tables are currently a single-constructor newtype, and they will be for at least a while. We do not know what ledger tables will look like when we store more parts of the ledger state, so let's cross that bridge when we get there.

jasagredo added the UTxO-HD label Sep 26, 2024

jasagredo force-pushed the utxo-hd-main branch from 961fb5c to 237372f September 27, 2024 09:14

jasagredo force-pushed the utxo-hd-main branch 2 times, most recently from b2d53b0 to 9395433 October 8, 2024 12:59

jasagredo force-pushed the utxo-hd-main branch 6 times, most recently from eb278d3 to 6b427d2 October 24, 2024 10:22

jasagredo force-pushed the utxo-hd-main branch from 6b427d2 to a851cd7 October 24, 2024 10:54

jasagredo commented Oct 24, 2024

jasagredo force-pushed the utxo-hd-main branch from a851cd7 to c252023 October 24, 2024 13:16

jasagredo marked this pull request as ready for review October 24, 2024 13:29

jasagredo requested review from nfrisby, amesgen, fraser-iohk and dnadales as code owners October 24, 2024 13:29

jasagredo changed the title ~~WIP: UTxO-HD targeting main~~ UTxO-HD targeting main Oct 24, 2024

nfrisby reviewed Oct 29, 2024

jorisdral reviewed Nov 1, 2024

nfrisby reviewed Nov 5, 2024

nfrisby reviewed Nov 6, 2024

ouroboros-consensus-cardano/app/DBAnalyser/Parsers.hs (outdated)

amesgen reviewed Nov 8, 2024

jorisdral reviewed Nov 11, 2024

This was referenced Nov 21, 2024

Parametrize LedgerTables by blk #1317

Open

QueryHF code is excessively verbose #1318

Closed

Move the QueryBatchSize field into the BackingStore #1321

Closed

amesgen reviewed Nov 22, 2024

jorisdral reviewed Nov 25, 2024

jasagredo mentioned this pull request Nov 25, 2024

DbSynthesizer should be able to run with different backends #1330

Open

jorisdral mentioned this pull request Dec 4, 2024

Re-enable the GetLedgerDB action in the ChainDB QSM tests #1339

Open

UTXO-HD

34726b9

jasagredo force-pushed the utxo-hd-main branch from c252023 to 34726b9 December 9, 2024 15:44

jasagredo requested a review from geo2a as a code owner December 9, 2024 15:44

jasagredo mentioned this pull request Dec 10, 2024

UTxO-HD release IntersectMBO/cardano-node#5918

Open

9 tasks

jasagredo and others added 12 commits December 10, 2024 14:50

Code review changes

ec6483b

Resolve PR comments for BackingStore lockstep tests

f19a26d

Update DiffSeq haddocks

3fcded8

Rework SOP code on HardForkCombinator

b1db8c0

Code-review changes

ba8f762

Reorganize LedgerDB

e524b42

Code-review changes

77c0275

Formatting

9b4061d

consensus: simplify some UTxO HD SOP code

f2eb2f5

consensus: abstact some query logic over UTxO HD footprints

4033b2d
