Data processing notes

In order to generate input to the validate_and_apply function, a lot of data needs to be gathered.

ChainState and Block

Generating ChainState and Block data involves joining information across multiple blocks and transactions. Since this kind of operation is slow over Bitcoin RPC, we use the Google Bitcoin data set, which lets us export data with plain SQL. Unfortunately, because of the missing transaction_index bug in the data set, it can't be the only source of data.

client

Input data is processed in multiple steps:

  1. The previous_timestamps.sql and previous_utxos.sql queries dump data into GCS.
  2. The timestamp data dump is processed by the generate_timestamp_data.py script: data is downloaded from GCS and index files are created. The index maps a block number to per-block timestamp-related data and is split into smaller files so that it can be loaded into memory quickly.
  3. The UTXO data dump is processed by the generate_utxo_data.py script: data is downloaded from GCS and the data files are split into smaller chunks, each containing data for several blocks. Index files are then created. The index maps a block number to a chunk file and is itself split into smaller files.
  4. Once the data dumps are processed, the functions get_timestamp_data and get_utxo_set give access to the per-block data.
  5. The generate_data script generates data that can be consumed by the validate_and_apply function.
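
The sharded index described in steps 2 and 3 can be sketched roughly as follows. This is only an illustration of the idea, not the code in generate_utxo_data.py: the chunk and shard sizes, the file naming scheme, and the helpers build_index, write_index_shards and lookup_chunk are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical parameters; the real scripts may use different values.
BLOCKS_PER_CHUNK = 10    # number of blocks stored in one data chunk file
INDEX_SHARD_SIZE = 100   # number of block numbers covered by one index file


def build_index(block_numbers):
    """Map each block number to the (hypothetical) chunk file holding its data."""
    return {n: f"chunk_{n // BLOCKS_PER_CHUNK:06d}.json" for n in block_numbers}


def write_index_shards(index, out_dir):
    """Split the index into small files so a lookup only loads one shard."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    shards = {}
    for block_number, chunk_file in index.items():
        shard_id = block_number // INDEX_SHARD_SIZE
        shards.setdefault(shard_id, {})[str(block_number)] = chunk_file
    for shard_id, entries in shards.items():
        (out_dir / f"index_{shard_id:06d}.json").write_text(json.dumps(entries))


def lookup_chunk(block_number, index_dir):
    """Load only the shard covering block_number and return its chunk file name."""
    shard_id = block_number // INDEX_SHARD_SIZE
    shard_path = Path(index_dir) / f"index_{shard_id:06d}.json"
    entries = json.loads(shard_path.read_text())
    return entries[str(block_number)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_index_shards(build_index(range(250)), tmp)
        print(lookup_chunk(123, tmp))  # chunk_000012.json
```

Keeping each index shard small means a lookup for a single block reads one small JSON file instead of the whole index, which is the property the steps above rely on.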

UtxoSet

tbd

Utreexo data

To generate Utreexo states and batch proofs for every block in Bitcoin's history, we need to run a special Bridge node.

Install

cargo install --git https://github.com/Davidson-Souza/bridge.git --no-default-features --features shinigami

Configure

Configure the connection to Bitcoin RPC via environment variables:

export BITCOIN_CORE_RPC_URL=http://localhost:8332
export BITCOIN_CORE_RPC_USER=username
export BITCOIN_CORE_RPC_PASSWORD=password
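
Before starting the bridge it can be useful to confirm that these variables point at a reachable node. The sketch below is not part of the bridge; it is a hypothetical helper that reads the same three variables and builds a standard Bitcoin Core JSON-RPC call (getblockcount) with HTTP basic auth. The function names build_rpc_request and get_block_count are made up for this example.

```python
import base64
import json
import os
import urllib.request


def build_rpc_request(method, params=None, env=os.environ):
    """Build a JSON-RPC request for Bitcoin Core from the three variables above."""
    url = env["BITCOIN_CORE_RPC_URL"]
    user = env["BITCOIN_CORE_RPC_USER"]
    password = env["BITCOIN_CORE_RPC_PASSWORD"]
    # Bitcoin Core accepts HTTP basic auth for RPC credentials.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    body = json.dumps(
        {"jsonrpc": "1.0", "id": "check", "method": method, "params": params or []}
    )
    return urllib.request.Request(
        url,
        data=body.encode(),
        headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
    )


def get_block_count(env=os.environ):
    """Call getblockcount; requires a running, reachable node."""
    request = build_rpc_request("getblockcount", env=env)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["result"]
```

Calling get_block_count() from a Python shell in the same environment should return the node's current block height if the URL and credentials are correct, and raise an error otherwise.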

Run

Run via screen, nohup, or set up a systemd service:

~/.cargo/bin/bridge

You can access per-block data at ~/.bridge/blocks/<bucket>/<block_height>.json:

  • Bucket size is 10k blocks;
  • The data directory can be changed by setting the DATA_DIR environment variable.
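
For scripts that consume the bridge output, the path layout above can be wrapped in a small helper. This is a sketch under two assumptions that the notes do not confirm and that should be checked against the actual directory contents: that the bucket name is the block height divided by the bucket size (0, 1, 2, ...), and that DATA_DIR replaces the default ~/.bridge root. The function name block_file_path is hypothetical.

```python
import os
from pathlib import Path

BUCKET_SIZE = 10_000  # from the notes above: each bucket covers 10k blocks


def block_file_path(block_height, data_dir=None):
    """Return the expected location of the per-block JSON file."""
    # Assumption: DATA_DIR, when set, replaces the default ~/.bridge root.
    root = Path(data_dir or os.environ.get("DATA_DIR") or Path.home() / ".bridge")
    # Assumption: buckets are named by integer division of the height.
    bucket = block_height // BUCKET_SIZE
    return root / "blocks" / str(bucket) / f"{block_height}.json"
```

For example, under these assumptions block_file_path(123456, "/tmp/bridge") resolves to /tmp/bridge/blocks/12/123456.json.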