Merge pull request #1 from MinaFoundation/sventimir/e2e-happy-path
End-to-end test - happy path
Sventimir authored Mar 20, 2024
2 parents 08ef54e + e41c015 commit 426d51b
Showing 16 changed files with 825 additions and 0 deletions.
File renamed without changes.
54 changes: 54 additions & 0 deletions blockchain_mock/README.md
@@ -0,0 +1,54 @@
Blockchain mock
===============

This script requires an uptime service to be already running. The
service should be configured such that it does not verify signatures
on submissions: we want to test the logic of the uptime service, not
signature verification. The mock is run as follows:

$ python generate_submissions.py --block-s3-bucket <bucket> \
--block-s3-dir <folder name> <uptime service URL>

The mock contains a list of 16 hard-coded public keys, which serve as
block producers' addresses. It loads the list of blocks from the given
location and forms the chain of state hashes. It also cycles over the
list of block producers so that it never runs out. It sets the block
pointer to the first block on the list.

Then every minute it picks the next node and sends a submission with
the block currently pointed to by the block pointer. Every 3 minutes
it also moves the block pointer to the next block. This way every
block producer appears to make a submission every 16 minutes and each
block gets submitted by 3 distinct block producers.

These intervals can be tweaked using command-line options (see the example after this list):

* `--block-time` followed by an integer defines the interval in seconds
after which the system proceeds to the next block.
* `--submission-time` followed by an integer defines the interval
in seconds after which the system proceeds with the next submission.
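For example, to advance to the next block every 30 seconds while
submitting every 10 seconds (interval values chosen purely for
illustration), the mock could be run as:

    $ python generate_submissions.py --block-time 30 --submission-time 10 \
        --block-s3-bucket <bucket> --block-s3-dir <folder name> \
        <uptime service URL>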

The mock can also be run without downloading blocks from S3. In this
case the `--block-s3-bucket` and `--block-s3-dir` parameters should be
replaced with `--block-dir`, pointing to a local directory containing
blocks in the layout shown below.
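A sketch of the expected local directory layout (the state hash file
names below are made up; each `.dat` file is named after a state hash
listed in `block_list.txt`):

    blocks/
        block_list.txt    # one state hash per line, in chain order
        3NKabc....dat     # raw block data for the first state hash
        3NLdef....dat     # raw block data for the next one, and so on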

The values of `--block-time` and `--submission-time` have no formal
constraints assigned to them, but they are expressed as an integral
number of seconds. Setting `--submission-time` to 0 will cause the
script to send submissions as fast as possible; actual submission
times will then depend on the machine's and the network connection's
throughput, so this is not recommended.

`--block-time` is only checked just before the next submission is
made. For this reason, setting it to a value smaller than or equal to
`--submission-time` will have the effect that a new block is picked
for every submission. Every block will nonetheless be submitted at
least once, even if `--block-time` is much smaller than
`--submission-time`, so there is no real point in choosing a value
smaller than `--submission-time`.

Note that choosing too low a value of `--block-time` relative to
`--submission-time` may result in the uptime service refusing to award
any points, because it expects to see the same blocks submitted by
multiple peers.
Empty file added blockchain_mock/__init__.py
28 changes: 28 additions & 0 deletions blockchain_mock/block_reader.py
@@ -0,0 +1,28 @@
"""This module defines a generic block reader, which cannot read blocks,
just contains the general policy of iterating over them."""


class BlockReader:
"""General logic of returning blocks in sequence, regardless of
how they're stored or read."""

def __init__(self):
"Initialize."
self.blocks = None
self.current_state_hash = None
self.current_block_data = None

def __iter__(self):
self.blocks = self.read_block_list()
return self

def __next__(self):
self.current_state_hash = next(self.blocks)
self.current_block_data = None
return self.current_state_hash

def read_block_list(self):
"""Read the list of block state hashes to process.
This is a dummy implementation, returning an empty
iterator. To be overridden in subclasses."""
return iter(())
42 changes: 42 additions & 0 deletions blockchain_mock/data.py
@@ -0,0 +1,42 @@
"This module holds some hard-coded testing data."

# Some dummy public keys. Their corresponding private keys are
# irrelevant for this test.
BP_KEYS = [
"B62qrx9d59WXHARNxQjMy4Eb1i9SRpwuH8gcuxM6dkHnAgTXcN6dDzf",
"B62qrhKjqf3jMWbtoM4VHqUAY5M3gE2v6Wm4L2dpw36GxtxPBzJPsyV",
"B62qo5uVVat4XfqWUk9EKE18ZHUpxiDn7zoksrZxrZSXYQxFAroSTzu",
"B62qiqkuPiZyNNJ2vcxx9m85dhGYzoSUCtnwL9YnqvSiK2TJMuXjvP5",
"B62qksawzNzjzn9CfqwqgsuJgf23aDUu5d6i8mYLTDEE4cXEjALEYxK",
"B62qr6pi7s78Kk9WNvPWPdSUSY6TTs84DzAzKKkLwsKkgbfudmN7Wd3",
"B62qpqhAdPrtMos2bmghGaKF9xjcdXiXFnxj51rqgFynv1bKbJuD8U7",
"B62qqywGkLD9TGMh9bxG9yJHpoZVbg1jSbY6gdzYejAZwSv42MAMrj5",
"B62qrAxCoBbtwDVn3SBkhwLeQqKK31xdnGuNkxPCZURnU2f3n3njqWJ",
"B62qk3T3xFms7iG5Qh5jE2xdESvCzVnfbxSMPKzhoj5eUiUdsa6k2eS",
"B62qkEqfYYa1o6uw7chgF4VcoqJJXjMriybGuARHR3vr2Ny3Nvz3UTM",
"B62qioo2qyrK4VEoubjba6mTM5GKTEcBAuqjetb4ACs9gH9VgJMDfYS",
"B62qjZaNVRxHTF763F15NQxnx9wqBcu5AHpaV2BvzUgEvi7zMRdP8fv",
"B62qnQHMbVt4ZqNinuigazUEGwYdQeuiAvBPu3NcuwPd6PZ3PdzkEfa",
"B62qnPaJTWs2ZcJwn5g53w7NvT3kEuK87z7pzuNyPyGShHkRmKBw1w9",
"B62qoj5CqvomLKMbocS8PupSkB69gHCEE8HN27ToqZQuNSAHao9GUZa"
]

# Some dummy libp2p peer ids.
LIBP2P_PEER_IDS = [
"12D3KooWAmMKPQKfF8bbUN7gjNSHvbrQ8NY5JhJ1qWCnMBzvJdh5",
"12D3KooWRuvBs2QNyE1TD1vfVqAXKbU3qwUvSJTDRWesgpvT4sxw",
"12D3KooWDe8Sq3CEp7HXJ9aAPghuvKT32PWBWpeSiQwXModSZ2A5",
"12D3KooWFG3XauGMv6xtAodKraS9FWGMPRMrTvjKeuHU21axerYW",
"12D3KooWJjna99EYehJgpYGUffTtgSMjiSeZ5UHXfuBdh9yGSnjN",
"12D3KooWCEcj6994dvi8nHofHgvHdK6PRi9USqE8dNSVCfaXLQdC",
"12D3KooWRBzi4zpPjssakh2uXADRMMk7X8vBtpCEsPuZW6spwgib",
"12D3KooWBg12daDfEZoASq7jAQAQn4E1d5RNhhD9MQN1AocpQR6x",
"12D3KooWLHyNwbiyKiTHTJTL28jYVzQ24zUKNSiiLYVZpufX4ubs",
"12D3KooWCtt2WMYnJsty27Bn17Q9vqHFFCKYWYew5WXhmf35X49i",
"12D3KooWMsepsE9zXaCcFTeyQsH1TYRJg4J5hVBKripXKRfU6vWm",
"12D3KooWDTxtEJ5PmcA4kTeBtDFyVhSaizdUKzx4NKFjhctUrKPV",
"12D3KooWPMvmxXtKydU9T52sXPFTpj1d9Pq2qvs6j3qisqsetJ7K",
"12D3KooWRxzEAisB3sXuDbsn3ZhBgxMHohFsTjk1wsBxk5J96JhB",
"12D3KooWHecvaEeAimF5gJ6FKBEcB2VcyLk3L7ynh7PgFX2VkPAJ",
"12D3KooWEQQxABEeYyX7DGLjDuEWTtNvEmN9UAPWWKTzxfMaFZQJ"
]
120 changes: 120 additions & 0 deletions blockchain_mock/generate_submissions.py
@@ -0,0 +1,120 @@
"""This module generates submissions to the uptime service from an
imaginary blockchain. There are 15 block producers sending submissions
in 1-minute intervals from one another. This way every block on the
blockchain is submitted 3 times. These blocks are pre-generated and
SNARK work proofs in submissions are dummy, as they are not produced
on a real blockchain. The goal is to run the uptime service validation
against these blocks and submissions and see that:
- every BP scores 100% of available points;
- blocks form a smooth, uninterrupted chain and there are no forks."""

import argparse
from datetime import datetime, timedelta, timezone
import itertools
import json
import sys
import time

import requests

from local_block_reader import LocalBlockReader
from s3_block_reader import S3BlockReader
from network import NODES


class Scheduler:
"""The scheduler mocks the behaviour of a real blockchain. It is
given a list of block producers and a list of blocks. It then
cycles over the block producers, returning them one by one, in
intervals of the submission_time. It also keeps track of the
current block, switching it every block_time, taking the next
block from the provided list. When that list is exhausted,
iteration stops."""

def __init__(self, nodes, block_reader,
block_time=timedelta(minutes=3),
submission_time=timedelta(minutes=1)):
"Initialize the scheduler."
self.block_reader = block_reader
self.nodes = itertools.cycle(nodes)
self.block_time = block_time
self.submission_time = submission_time
self.next_block = None
self.next_submission = None

def __iter__(self):
"Initialize an iteration."
        now = datetime.now(timezone.utc)
        # datetime.replace returns a new object, so the result must be
        # reassigned to round the timestamp down to the whole minute
        now = now.replace(second=0, microsecond=0)
self.next_block = now + self.block_time
self.next_submission = now + self.submission_time
# initialize iteration on block reader and select the first block
iter(self.block_reader)
next(self.block_reader)
return self

def __next__(self):
"Return the next scheduled submission."
now = datetime.now(timezone.utc)
if now >= self.next_block:
self.next_block += self.block_time
# at some point this will raise StopIteration
# which we allow to propagate to terminate the
# scheduling
next(self.block_reader)

if now < self.next_submission:
time.sleep((self.next_submission - now).total_seconds())

self.next_submission += self.submission_time
return next(self.nodes)

def read_block(self):
"Use the block reader to extract more block data."
return self.block_reader.read_block()

@property
def current_block(self):
"Return the state hash of the current block."
return self.block_reader.current_state_hash


def parse_args():
"Parse command line options."
p = argparse.ArgumentParser()
p.add_argument("--block-dir", help="Directory with block files.")
p.add_argument("--block-s3-bucket", help="S3 bucket where blocks are stored.")
p.add_argument("--block-s3-dir", help="S3 directory where blocks are stored.")
p.add_argument("--block-time", default=180, type=int, help="Block time in seconds.")
p.add_argument("--submission-time", default=60, type=int,
help="Interval between subsequent submissions.")
p.add_argument("uptime_service_url")
return p.parse_args()

def main(args):
"""Generate submissions for the uptime service."""
if args.block_dir is not None:
block_reader = LocalBlockReader(args.block_dir)
elif args.block_s3_dir is not None and args.block_s3_bucket is not None:
block_reader = S3BlockReader(args.block_s3_bucket, args.block_s3_dir)
else:
raise RuntimeError("No block storage provided!")

scheduler = Scheduler(
NODES,
block_reader,
block_time=timedelta(seconds=args.block_time),
submission_time=timedelta(seconds=args.submission_time)
)
for node in scheduler:
sub = node.submission(scheduler.read_block())
now = datetime.now(timezone.utc)
print(f"{now}: Submitting block {scheduler.current_block} for {node.public_key}...")
r = requests.post(args.uptime_service_url, json=sub, timeout=15.0)
        json.dump(r.json(), sys.stdout, indent=2)
        print("\nDone.")


if __name__ == "__main__":
main(parse_args())
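For quick experimentation, the `Scheduler` above can also be driven directly with a stub block reader. A minimal sketch (the `StubReader` class, the fake state hashes and the short intervals are illustrative, not part of this change):

```python
from datetime import timedelta

from block_reader import BlockReader
from generate_submissions import Scheduler
from network import NODES


class StubReader(BlockReader):
    "Illustrative reader yielding a fixed list of fake state hashes."

    def read_block_list(self):
        return iter(["hash_1", "hash_2", "hash_3"])

    def read_block(self):
        return f"<data for {self.current_state_hash}>"


# With block_time = 3 * submission_time, each fake block receives
# 3 submissions before the block pointer advances; iteration stops
# when the block list is exhausted.
scheduler = Scheduler(NODES, StubReader(),
                      block_time=timedelta(seconds=3),
                      submission_time=timedelta(seconds=1))
for node in scheduler:
    print(node.public_key, "->", scheduler.current_block)
```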
31 changes: 31 additions & 0 deletions blockchain_mock/local_block_reader.py
@@ -0,0 +1,31 @@
"This module is concerned with reading blocks from as local storage (file system)."

import base64
import os

from block_reader import BlockReader


class LocalBlockReader(BlockReader):
"Read blocks from local file system."

def __init__(self, block_dir):
"Initialize."
super().__init__()
self.block_dir = block_dir

def read_block_list(self):
"Read the list of block state hashes to process."
block_list = os.path.join(self.block_dir, "block_list.txt")
with open(block_list, "r", encoding="utf-8") as fp:
return (l.strip() for l in fp.readlines())

def read_block(self):
"""Read the current block's data from disk and cache for
future reuse."""
if self.current_block_data is None:
filename = f"{self.current_state_hash}.dat"
path = os.path.join(self.block_dir, filename)
with open(path, "rb") as fp:
self.current_block_data = base64.b64encode(fp.read()).decode("utf-8")
return self.current_block_data
32 changes: 32 additions & 0 deletions blockchain_mock/network.py
@@ -0,0 +1,32 @@
"A mock for a real Mina network."

from dataclasses import dataclass
from datetime import datetime, timezone

from data import BP_KEYS, LIBP2P_PEER_IDS


@dataclass
class Node:
"""A Node is a simple bundle of a block producer's key and corresponding
lilp2p peer id. It's capable of generating submission in the name od that
node."""
peer_id: str
public_key: str

def submission(self, block):
"""Create a new submission. Actually make it a method of an object
containing a BP pub key and a peer_id."""
now = datetime.now(timezone.utc)
return {
"submitter": self.public_key,
"signature": "7mX1kSj74K1FVnNrRhDMabMshRA2iNadA5Q5ikqh95FAE3Hi4o6fQUQzgHmuacLk7ZZh9evh1FwAzMe1JwCycr5PZQ3RoXZf",
"data": {
"block": block,
"created_at": now.strftime('%Y-%m-%dT%H:%M:%SZ'),
"peer_id": self.peer_id,
"snark_work": None
}
}

NODES = [Node(peer_id, bp) for bp, peer_id in zip(BP_KEYS, LIBP2P_PEER_IDS)]
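For reference, a sketch of the payload `Node.submission` produces (the block payload and timestamp below are placeholders):

```python
from network import NODES

node = NODES[0]
sub = node.submission("<base64-encoded block data>")
# sub is a dict of the shape:
# {
#     "submitter": node.public_key,        # the BP's public key
#     "signature": "7mX1kSj74K1FVnNr...",  # the fixed dummy signature
#     "data": {
#         "block": "<base64-encoded block data>",
#         "created_at": "2024-03-20T12:00:00Z",  # current UTC time
#         "peer_id": node.peer_id,
#         "snark_work": None
#     }
# }
```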
37 changes: 37 additions & 0 deletions blockchain_mock/s3_block_reader.py
@@ -0,0 +1,37 @@
"""This module is concerned with downloading blocks to submit from S3."""

import base64
import boto3

from block_reader import BlockReader


class S3BlockReader(BlockReader):
"Read blocks from local file system."

def __init__(self, s3_bucket, prefix):
"Initialize."
super().__init__()
self.bucket = s3_bucket
self.prefix = prefix
self.client = boto3.client("s3")

def read_block_list(self):
"Read the list of block state hashes to process."
block_list_resp = self.client.get_object(
Bucket=self.bucket,
Key=f"{self.prefix}/block_list.txt"
)
return (bs.decode("utf8").strip() for bs in block_list_resp["Body"].readlines())

def read_block(self):
"""Read the current block's data from disk and cache for
future reuse."""
if self.current_block_data is None:
block_resp = self.client.get_object(
Bucket=self.bucket,
Key=f"{self.prefix}/{self.current_state_hash}.dat"
)
self.current_block_data = base64.b64encode(block_resp["Body"].read()).decode("utf-8")

return self.current_block_data
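A minimal usage sketch, assuming AWS credentials are configured for `boto3`; the bucket and prefix names below are made up:

```python
from s3_block_reader import S3BlockReader

reader = S3BlockReader("my-blocks-bucket", "happy-path-blocks")
for state_hash in reader:
    # The block body is downloaded lazily and cached per block.
    block_b64 = reader.read_block()
    print(state_hash, len(block_b64))
```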
36 changes: 36 additions & 0 deletions db_migration/README.md
@@ -0,0 +1,36 @@
# Migrate the uptime service database

The delegation program that the Foundation ran before the hard fork will continue after the hard fork. Because the computations of the uptime service are recursive in nature (output from the previous round influences how the next round will go), we need to migrate some data from the old service to the new one to serve as input for the first round. We're not so much concerned with keeping a historic record (such a record will be kept in the form of a final database snapshot) as with providing the initial state of the service to kick it off properly. To this end we run a specialised migration script, whose task is to:

- Trim the database, keeping only data relevant to the past 3 months (i.e. after 1st December 2023).
- Migrate it to the format expected by the new uptime service.

Differences between the old database schema and the new one are not large. A couple of tables have been dropped, as well as a few columns in existing tables. One column needs to be added to an existing table: the score_history table lacks a primary key. The migration script linked below takes care of all that.

[migrate.sql](./migrate.sql)

This script is intended to be run on a **copy** of the Ontab database. Because it performs DELETE commands on some tables, it's best to preserve a backup of the original database, both for safety and to keep the historic data, which might yet be needed in the future. This copy of the Ontab database will be provided as a database snapshot from the AWS RDS service. It is important that the snapshot be encrypted with a key MF has access to, so that we can restore the database from it on our end and run the script above.
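One straightforward way to run the script against the restored copy (the hostname, user and database name below are placeholders):

```bash
$ psql -h $COPY_HOST -p 5432 -U $ADMIN_USER -d $DB_NAME -f migrate.sql
```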

NOTE: this will take several hours to execute (there are a lot of rows to drop and constraints to check). When the script is done, take a snapshot of the resulting database as a backup; you can then hook the uptime service coordinator up to it.

## Taking a db diff

Under some circumstances (e.g. an emergency hard fork) there might not be enough time to take a full snapshot of the Ontab database (it takes several hours to produce) and then trim it (which takes another several hours). In that case a quicker approach can be used to synchronise the databases: the script linked below takes a partial db dump (for increased speed) from the original database and moves the relevant records to the new database. The script can be run like this:

```bash
$ python db_diff.py -H $HOST -p $PORT -U $USERNAME -w $PASSWORD -d leaderboard_snark $DATE
```

[db_diff.py](./db_diff.py)

Where:

- `$HOST` is the hostname of the original Ontab database (delegation-uptime-ontab-4.c14zvdudnyw7.us-west-2.rds.amazonaws.com at the time of writing).
- `$PORT` is the port of the service (defaults to 5432).
- `$USERNAME` is a user that can read the database (we've graciously been given the `ro_user` user to access the database).
- `$PASSWORD` is `$USERNAME`'s password for the database.
- `$DATE` is the day from which we want to take the diff (e.g. 2024-02-01); only records added after that date will be dumped.

NOTE: the script requires the psycopg2 package, which can be installed from pip.

This script will produce SQL commands on the standard output. Write them to a file and run that file against the target database to load the data, or simply pipe the output straight to the `psql` command, as in the example below.
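For example, piping the diff straight into the target database (the target connection parameters are placeholders):

```bash
$ python db_diff.py -H $HOST -p $PORT -U $USERNAME -w $PASSWORD -d leaderboard_snark $DATE \
    | psql -h $TARGET_HOST -U $TARGET_USER -d $TARGET_DB
```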