Fix bench-e2e single mode and keep results (#1693)
This fixes two issues with the `bench-e2e` binary / benchmark:

* Running in `single` mode was not working because of a
  `FeeTooSmallUTxO` error.
* The `results.csv` was written into a temporary directory and removed
  afterwards, which made plotting impossible.

I was in the mood for some refactoring, so this also contains various
other changes I encountered while working on the code and while [tidying
up](https://tidyfirst.substack.com/p/the-life-changing-magic-of-tidying)
a bit.

The refactoring separated hydra node and payment keys further, which
requires the datasets to be re-generated. I took the liberty of
generating them with `--scaling-factor 10`, which results in `300`
transactions per client. That should be long enough to identify
regressions, with **hopefully 10x shorter benchmark time** in CI.
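
For reference, a dataset at the new scale can be regenerated by running the
benchmark in `single` mode. This is only a sketch: the flag names are assumed
from the option parsers in `Bench/Options.hs`, and the output path is taken
from the README example below.

```sh
# Sketch: regenerate a dataset with the new scaling factor.
# --scaling-factor and --output-directory are assumed flag names;
# with --scaling-factor 10 this yields 300 transactions per client.
cabal run bench-e2e -- single \
  --scaling-factor 10 \
  --output-directory out
# The generated dataset ends up in out/dataset.json
```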

Another benefit of this separation is that it naturally led to reducing
the assumptions of the `demo` mode: instead of seeding the hydra node
cardano keys itself, it re-uses `seed-devnet.sh`, which results in
looser coupling between the workload and the container setup in our
network test workflow.
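
Condensed from the network test workflow changed below, the decoupled demo
setup looks roughly like this (a sketch; see
`.github/workflows/network-test.yaml` for the full steps, including script
publishing and socket permissions):

```sh
# Sketch of the decoupled demo setup (condensed from network-test.yaml):
cd demo
./prepare-devnet.sh
docker compose up -d cardano-node
# Seeding is delegated to the script instead of the benchmark
# seeding the hydra node cardano keys itself:
./seed-devnet.sh "nix run .#cardano-cli --" "nix run .#hydra-node --"
# Two compose files; the second overrides images with the netem ones:
docker compose -f docker-compose.yaml -f docker-compose-netem.yaml up -d hydra-node-{1,2,3}
```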

I'm not 100% happy that the bench now requires the `--output-directory`
to be empty, and that in turn the whole state gets captured as an
artifact of our CI. Making the state directory always a `/tmp` path that
is retained in case of errors (or configurable with `--state-directory`)
would be better. But that can go into another PR... another time.

---

* [x] CHANGELOG updated
* [x] Documentation updated (README)
* [x] Haddocks updated
* [x] No new TODOs introduced or explained hereafter
  - Two XXX notes on what to improve further
ch1bo authored Oct 10, 2024
2 parents dff6655 + ec21ac0 commit 321167e
Showing 18 changed files with 384 additions and 419 deletions.
23 changes: 3 additions & 20 deletions .github/workflows/network-test.yaml
@@ -69,28 +69,14 @@ jobs:
cd demo
./prepare-devnet.sh
docker compose up -d cardano-node
sleep 5
sleep 2
# :tear: socket permissions.
sudo chown runner:docker devnet/node.socket
HYDRA_SCRIPTS_TX_ID=$(nix run .#hydra-node -- publish-scripts \
--testnet-magic 42 \
--node-socket devnet/node.socket \
--cardano-signing-key devnet/credentials/faucet.sk)
echo "HYDRA_SCRIPTS_TX_ID=$HYDRA_SCRIPTS_TX_ID" > .env
nix run .#cardano-cli -- query protocol-parameters \
--testnet-magic 42 \
--socket-path devnet/node.socket \
--out-file /dev/stdout \
| jq ".txFeeFixed = 0 | .txFeePerByte = 0 | .executionUnitPrices.priceMemory = 0 | .executionUnitPrices.priceSteps = 0" \
> devnet/protocol-parameters.json
sudo chmod a+w devnet/node.socket
./seed-devnet.sh "nix run .#cardano-cli --" "nix run .#hydra-node --"
# Specify two docker compose yamls; the second one overrides the
# images to use the netem ones specifically
docker compose -f docker-compose.yaml -f docker-compose-netem.yaml up -d hydra-node-{1,2,3}
sleep 3
docker ps
- name: Build required nix and docker derivations
@@ -119,9 +105,6 @@ jobs:
.github/workflows/network/run_pumba.sh $target_peer $percent $other_peers
# Run benchmark on demo
mkdir benchmarks
touch benchmarks/test.log
nix run .#legacyPackages.x86_64-linux.hydra-cluster.components.benchmarks.bench-e2e -- \
demo \
--output-directory=benchmarks \
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -40,7 +40,10 @@ changes.
- Overall this results in transactions still to be submitted once per client,
but requires significantly less book-keeping on the client-side.

- Add **Blockfrost Mode** to `hydra-chain-observer`, to follow the chain via Blockfrost API.
- Add blockfrost support to `hydra-chain-observer`, to follow the chain via Blockfrost API.

- Fix `bench-e2e single` benchmarks and only use `--output-directory` to keep
the whole benchmark state.

- Add `inlineDatumRaw` to transaction outputs on the `hydra-node` API.

68 changes: 41 additions & 27 deletions hydra-cluster/README.md
@@ -113,41 +113,55 @@ Run the integration test suite with `cabal test`
# Benchmarks
The benchmark can be run using `cabal bench` and produces a
`results.csv` file in a work directory. To plot the transaction
confirmation times you can use the `bench/plot.sh` script, passing it
the directory containing the benchmark's results.
The benchmark can be run using `cabal bench` or `cabal run bench-e2e` and
produces a `results.csv` file in a work directory. To plot the transaction
confirmation times you can use the `bench/plot.sh` script, passing it the
directory containing the benchmark's results.
To run and plot results of the benchmark:
```sh
$ cabal bench --benchmark-options 'single'
Running 1 benchmarks...
Benchmark bench-e2e: RUNNING...
Writing transactions to: /run/user/1000/bench-83d18973f95a554d/txs.json
[...]
Writing results to: /run/user/1000/bench-6b772589d08f82a5/results.csv
Benchmark bench-e2e: FINISH
$ bench/plot.sh /run/user/1000/bench-6b772589d08f82a5
Created plot: /run/user/1000/bench-6b772589d08f82a5/results.png
cabal run bench-e2e -- single --output-directory out
bench/plot.sh out
```
Note that, if it's present in the environment, the benchmark executable will gather basic system-level statistics about the RAM, CPU, and network bandwidth used. The `plot.sh` script then displays those alongside tx confirmation time in a single graph.
This will produce an output like:
The benchmark can be run in two modes corresponding to two different commands:
* `single`: Runs a single _dataset_, either freshly generated in some temporary directory or pre-existing. This is useful either to generate data or to run experiments.
* `datasets`: Runs one or more pre-existing _datasets_ in sequence and collects their results in a single markdown-formatted file. This is useful to track the evolution of the hydra-node's performance on well-known datasets over time and to produce a human-readable summary.
Check out `cabal bench --benchmark-options --help` for more details.
# Network Testing
```
Generating dataset with scaling factor: 10
Writing dataset to: out/dataset.json
Test logs available in: out/test.log
Starting benchmark
Seeding network
Fund scenario from faucet
Fuel node key "16e61ed92346eb0b0bd1c6d8c0f924b4d1278996a61043a0a42afad193e5f3fb"
Publishing hydra scripts
Starting hydra cluster in out
Initializing Head
Committing initialUTxO from dataset
HeadIsOpen
Client 1 (node 0): 0/300 (0.00%)
Client 1 (node 0): 266/300 (88.67%)
All transactions confirmed. Sweet!
Closing the Head
Finalizing the Head
Writing results to: out/results.csv
Confirmed txs/Total expected txs: 300/300 (100.00 %)
Average confirmation time (ms): 18.747147496
P99: 23.100851369999994ms
P95: 19.81722345ms
P50: 18.532922ms
Invalid txs: 0
Writing report to: out/end-to-end-benchmarks.md
line 0: warning: Cannot find or open file "out/system.csv"
Created plot: out/results.png
```
The benchmark can also be run over a running `demo` hydra-cluster, using `cabal bench`, and produces a
`results.csv` file in a work directory. As with the other benchmark results, you can use the `bench/plot.sh` script to plot the transaction confirmation times.
Note that, if it's present in the environment, the benchmark executable will gather basic system-level statistics about the RAM, CPU, and network bandwidth used. The `plot.sh` script then displays those alongside tx confirmation time in a single graph.
To run the benchmark in this mode, the command is:
* `demo`: Runs a single _dataset_, freshly generated, and collects its results in a markdown-formatted file. The purpose of this setup is to facilitate a variety of network-resilience scenarios, such as packet loss or node failures. This is useful to prove the robustness and performance of the hydra-node's network over time and to produce a human-readable summary.
The benchmark can be run in three modes:
For instance, we make use of this in our [CI](https://github.com/cardano-scaling/hydra/blob/master/.github/workflows/network-test.yaml) to keep track of scenarios that we care about.
* `single`: Generates a single _dataset_ and runs the benchmark with it.
* `datasets`: Runs one or more pre-existing _datasets_ in sequence and collects their results in a single markdown-formatted file. This is useful to track the evolution of the hydra-node's performance on well-known datasets over time and to produce a human-readable summary.
* `demo`: Generates transactions against an already running network of cardano and hydra nodes. This can serve as a workload when testing network-resilience scenarios, such as packet loss or node failures. See [this CI workflow](https://github.com/cardano-scaling/hydra/blob/master/.github/workflows/network-test.yaml) for how it is used; example invocations of all three modes follow below.
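
For illustration, hedged invocations of the three modes: the `single` call is
taken from this README, the `demo` call from the network test workflow above,
while the `datasets` file argument is hypothetical.

```sh
# single: generate a dataset and run the benchmark with it, keeping state in ./out
cabal run bench-e2e -- single --output-directory out

# datasets: replay pre-existing datasets (the file path here is hypothetical)
cabal run bench-e2e -- datasets datasets/3-nodes.json --output-directory out

# demo: drive an already running demo network, as in the CI workflow
# (further connection flags for the running nodes are elided here)
nix run .#legacyPackages.x86_64-linux.hydra-cluster.components.benchmarks.bench-e2e -- \
  demo --output-directory=benchmarks
```
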
89 changes: 38 additions & 51 deletions hydra-cluster/bench/Bench/EndToEnd.hs
@@ -22,7 +22,6 @@ import Control.Lens (to, (^..), (^?))
import Control.Monad.Class.MonadAsync (mapConcurrently)
import Data.Aeson (Result (Error, Success), Value, encode, fromJSON, (.=))
import Data.Aeson.Lens (key, values, _JSON, _Number, _String)
import Data.Aeson.Types (parseMaybe)
import Data.List qualified as List
import Data.Map qualified as Map
import Data.Scientific (Scientific)
@@ -32,18 +31,15 @@ import Data.Time (UTCTime (UTCTime), utctDayTime)
import Hydra.Cardano.Api (NetworkId, SocketPath, Tx, TxId, UTxO, getVerificationKey, signTx)
import Hydra.Cluster.Faucet (FaucetLog (..), publishHydraScriptsAs, returnFundsToFaucet', seedFromFaucet)
import Hydra.Cluster.Fixture (Actor (..))
import Hydra.Cluster.Scenarios (
EndToEndLog (..),
headIsInitializingWith,
)
import Hydra.Generator (ClientDataset (..), ClientKeys (..), Dataset (..))
import Hydra.Cluster.Scenarios (EndToEndLog (..))
import Hydra.Generator (ClientDataset (..), Dataset (..))
import Hydra.Logging (
Tracer,
traceWith,
withTracerOutputTo,
)
import Hydra.Network (Host)
import Hydra.Tx (HeadId, Party, deriveParty, txId)
import Hydra.Tx (HeadId, txId)
import Hydra.Tx.ContestationPeriod (ContestationPeriod (UnsafeContestationPeriod))
import Hydra.Tx.Crypto (generateSigningKey)
import HydraNode (
@@ -75,25 +71,15 @@ import Text.Printf (printf)
import Text.Regex.TDFA (getAllTextMatches, (=~))
import Prelude (read)

data Event = Event
{ submittedAt :: UTCTime
, validAt :: Maybe UTCTime
, invalidAt :: Maybe UTCTime
, confirmedAt :: Maybe UTCTime
}
deriving stock (Generic, Eq, Show)
deriving anyclass (ToJSON)

bench :: Int -> NominalDiffTime -> FilePath -> Dataset -> IO Summary
bench startingNodeId timeoutSeconds workDir dataset@Dataset{clientDatasets} = do
bench startingNodeId timeoutSeconds workDir dataset = do
putStrLn $ "Test logs available in: " <> (workDir </> "test.log")
withFile (workDir </> "test.log") ReadWriteMode $ \hdl ->
withTracerOutputTo hdl "Test" $ \tracer ->
failAfter timeoutSeconds $ do
putTextLn "Starting benchmark"
let cardanoKeys = map (\ClientDataset{clientKeys = ClientKeys{signingKey}} -> (getVerificationKey signingKey, signingKey)) clientDatasets
let cardanoKeys = hydraNodeKeys dataset <&> \sk -> (getVerificationKey sk, sk)
let hydraKeys = generateSigningKey . show <$> [1 .. toInteger (length cardanoKeys)]
let parties = Set.fromList (deriveParty <$> hydraKeys)
withOSStats workDir $
withCardanoNodeDevnet (contramap FromCardanoNode tracer) workDir $ \node@RunningNode{nodeSocket} -> do
putTextLn "Seeding network"
@@ -103,10 +89,9 @@ bench startingNodeId timeoutSeconds workDir dataset@Dataset{clientDatasets} = do
putStrLn $ "Starting hydra cluster in " <> workDir
let hydraTracer = contramap FromHydraNode tracer
let contestationPeriod = UnsafeContestationPeriod 10
withHydraCluster hydraTracer workDir nodeSocket startingNodeId cardanoKeys hydraKeys hydraScriptsTxId contestationPeriod $ \(leader :| followers) -> do
let clients = leader : followers
withHydraCluster hydraTracer workDir nodeSocket startingNodeId cardanoKeys hydraKeys hydraScriptsTxId contestationPeriod $ \clients -> do
waitForNodesConnected hydraTracer 20 clients
scenario hydraTracer node workDir dataset parties leader followers
scenario hydraTracer node workDir dataset clients

benchDemo ::
NetworkId ->
@@ -129,14 +114,13 @@ benchDemo networkId nodeSocket timeoutSeconds hydraClients workDir dataset@Datas
Just node -> do
putTextLn "Seeding network"
seedNetwork node dataset (contramap FromFaucet tracer)
let clientSks = clientKeys <$> clientDatasets
(`finally` returnFaucetFunds tracer node clientSks) $ do
(`finally` returnFaucetFunds tracer node) $ do
putStrLn $ "Connecting to hydra cluster: " <> show hydraClients
let hydraTracer = contramap FromHydraNode tracer
withHydraClientConnections hydraTracer (hydraClients `zip` [1 ..]) [] $ \case
[] -> error "no hydra clients provided"
(leader : followers) ->
scenario hydraTracer node workDir dataset mempty leader followers
scenario hydraTracer node workDir dataset (leader :| followers)
where
withHydraClientConnections tracer apiHosts connections action = do
case apiHosts of
@@ -145,40 +129,34 @@
withConnectionToNodeHost tracer peerId apiHost (Just "/?history=no") $ \con -> do
withHydraClientConnections tracer rest (con : connections) action

returnFaucetFunds tracer node cKeys = do
returnFaucetFunds tracer node = do
putTextLn "Returning funds to faucet"
let faucetTracer = contramap FromFaucet tracer
let senders = concatMap @[] (\(ClientKeys sk esk) -> [sk, esk]) cKeys
mapM_
( \sender -> do
returnAmount <- returnFundsToFaucet' faucetTracer node sender
traceWith faucetTracer $ ReturnedFunds{actor = show sender, returnAmount}
)
senders
forM (hydraNodeKeys dataset <> (paymentKey <$> clientDatasets)) $ \sk -> do
returnAmount <- returnFundsToFaucet' faucetTracer node sk
traceWith faucetTracer $ ReturnedFunds{returnAmount}

-- | Runs the benchmark scenario given a list of clients. The first client is
-- used to drive the life-cycle of the head.
scenario ::
Tracer IO HydraNodeLog ->
RunningNode ->
FilePath ->
Dataset ->
Set Party ->
HydraClient ->
[HydraClient] ->
NonEmpty HydraClient ->
IO Summary
scenario hydraTracer node workDir Dataset{clientDatasets, title, description} parties leader followers = do
scenario hydraTracer node workDir Dataset{clientDatasets, title, description} nonEmptyClients = do
let clusterSize = fromIntegral $ length clientDatasets
let clients = leader : followers
let leader = head nonEmptyClients
clients = toList nonEmptyClients
let totalTxs = sum $ map (length . txSequence) clientDatasets

putTextLn "Initializing Head"
send leader $ input "Init" []
headId <-
waitForAllMatch (fromIntegral $ 10 * clusterSize) clients $ \v ->
headIsInitializingWith parties v
<|> do
guard $ v ^? key "tag" == Just "HeadIsInitializing"
headId <- v ^? key "headId"
parseMaybe parseJSON headId :: Maybe HeadId
headId :: HeadId <-
waitForAllMatch (fromIntegral $ 10 * clusterSize) clients $ \v -> do
guard $ v ^? key "tag" == Just "HeadIsInitializing"
v ^? key "headId" . _JSON

putTextLn "Comitting initialUTxO from dataset"
expectedUTxO <- commitUTxO node clients clientDatasets
@@ -322,18 +300,18 @@ movingAverage confirmations =
-- | Distribute 100 ADA fuel and starting funds from the faucet for each client in the
-- dataset.
seedNetwork :: RunningNode -> Dataset -> Tracer IO FaucetLog -> IO ()
seedNetwork node@RunningNode{nodeSocket, networkId} Dataset{fundingTransaction, clientDatasets} tracer = do
seedNetwork node@RunningNode{nodeSocket, networkId} Dataset{fundingTransaction, hydraNodeKeys} tracer = do
fundClients
forM_ (clientKeys <$> clientDatasets) fuelWith100Ada
forM_ hydraNodeKeys fuelWith100Ada
where
fundClients = do
putTextLn "Fund scenario from faucet"
submitTransaction networkId nodeSocket fundingTransaction
void $ awaitTransaction networkId nodeSocket fundingTransaction

fuelWith100Ada ClientKeys{signingKey} = do
fuelWith100Ada signingKey = do
let vk = getVerificationKey signingKey
putTextLn $ "Seed client " <> show vk
putTextLn $ "Fuel node key " <> show vk
seedFromFaucet node vk 100_000_000 tracer

-- | Commit all (expected to exist) 'initialUTxO' from the dataset using the
@@ -342,12 +320,21 @@ commitUTxO :: RunningNode -> [HydraClient] -> [ClientDataset] -> IO UTxO
commitUTxO node clients clientDatasets =
mconcat <$> forM (zip clients clientDatasets) doCommit
where
doCommit (client, ClientDataset{initialUTxO, clientKeys = ClientKeys{externalSigningKey}}) = do
doCommit (client, ClientDataset{initialUTxO, paymentKey}) = do
requestCommitTx client initialUTxO
<&> signTx externalSigningKey
<&> signTx paymentKey
>>= submitTx node
pure initialUTxO

data Event = Event
{ submittedAt :: UTCTime
, validAt :: Maybe UTCTime
, invalidAt :: Maybe UTCTime
, confirmedAt :: Maybe UTCTime
}
deriving stock (Generic, Eq, Show)
deriving anyclass (ToJSON)

processTransactions :: [HydraClient] -> [ClientDataset] -> IO (Map.Map TxId Event)
processTransactions clients clientDatasets = do
let processors = zip (zip clientDatasets (cycle clients)) [1 ..]
49 changes: 25 additions & 24 deletions hydra-cluster/bench/Bench/Options.hs
@@ -6,17 +6,36 @@ import Hydra.Cardano.Api (NetworkId, SocketPath)
import Hydra.Chain (maximumNumberOfParties)
import Hydra.Network (Host, readHost)
import Hydra.Options (networkIdParser, nodeSocketParser)
import Options.Applicative (Parser, ParserInfo, auto, command, fullDesc, header, help, helpDoc, helper, hsubparser, info, long, maybeReader, metavar, option, progDesc, short, str, strOption, value)
import Options.Applicative (
Parser,
ParserInfo,
auto,
command,
fullDesc,
header,
help,
helper,
hsubparser,
info,
long,
maybeReader,
metavar,
option,
progDesc,
short,
str,
strOption,
value,
)
import Options.Applicative.Builder (argument)
import Options.Applicative.Help (Doc, align, fillSep, line, (<+>))

data Options
= StandaloneOptions
{ workDirectory :: Maybe FilePath
{ scalingFactor :: Int
, clusterSize :: Word64
, outputDirectory :: Maybe FilePath
, scalingFactor :: Int
, timeoutSeconds :: NominalDiffTime
, clusterSize :: Word64
, startingNodeId :: Int
}
| DatasetOptions
@@ -65,28 +84,10 @@ standaloneOptionsInfo =
standaloneOptionsParser :: Parser Options
standaloneOptionsParser =
StandaloneOptions
<$> optional
( strOption
( long "work-directory"
<> helpDoc
( Just $
"Directory containing generated transactions, UTxO set, log files for spawned processes, etc."
<> item
[ "If the directory exists, it's assumed to be used for replaying"
, "a previous benchmark and is expected to contain 'txs.json' and"
, "'utxo.json' files,"
]
<> item
[ "If the directory does not exist, it will be created and"
, "populated with new transactions and UTxO set."
]
)
)
)
<$> scalingFactorParser
<*> clusterSizeParser
<*> optional outputDirectoryParser
<*> scalingFactorParser
<*> timeoutParser
<*> clusterSizeParser
<*> startingNodeIdParser

item :: [Doc] -> Doc
