diff --git a/crates/oracle/docs/decoding-errors.org b/crates/oracle/docs/decoding-errors.org
deleted file mode 100644
index 9667eca7..00000000
--- a/crates/oracle/docs/decoding-errors.org
+++ /dev/null
@@ -1,81 +0,0 @@
-#+title: Decoding Errors
-
-How should the Block Oracle and the Epoch Subgraph behave when the subgraph fails to
-decode a Message payload /(calldata)/?
-
-* Epoch Subgraph
-The subgraph must signal that an Oracle message for the current epoch is invalid.
-
-It should create a new =DecodingError= (entity) instance scoped to the transaction it just received.
-
-The subgraph will not enter on a failed state because of this.
-
-* Block Oracle
-Once it detects a =DecodingError= for the current epoch, the Oracle will change
-its behaviour/mode.
-
-It will transition its internal state from =Valid= to =SubgraphDecodingError= and enter
-a [[Preemptive State]], followed by a possible [[Alert State]].
-
-** Subgraph Error Monitoring
-By introducing the concept of =DecodingErrors= in the Epoch Subgraph, the Block Oracle
-is given the responsibility to actively monitor the former during the ongoing epoch
-until its state is successfully updated.
-
-This routine complements the [[Transaction Monitoring]] system.
-
-** Preemptive State
-In this state, the Oracle will:
-1. Emit an alert.
-2. Build and send a single preemptive =SetBlockNumberForCurrentEpoch=
-   message. Note that this specific message will not include
-   unregistered networks, as in compliance with the current Epoch
-   Subgraph state.
-3. Watch if the preemptive message triggered another =DecodingError= in the subgraph for
-   this epoch.
-
-** Why?
-To prioritize closing allocations. This operational mode will ensure that the current
-indexed chains continue to receive updates, regardless of failing to register other
-networks.
-
-** Assumptions
-Given the following network sets in the context of Message creation:
-- previously registered :: ~O~ /(old)/
-- to be registered :: ~N~ /(new)/
-- to have its chain head updated :: ~O ∩ N~ /(old + new)/
-
-The Oracle could have prepared and sent a malformed =[RegisterNetworks { N },
-SetBlockNumberForCurrentEpoch { O ∩ N }]= message block.
-
-There is a chance that the encoding error was caused by the =RegisterNetworks { N }=
-message, so it might still be possible to send a =SetBlockNumberForCurrentEpoch { O }=
-message to update the chain head for all previously registered networks.
-
-** On Success
-If no second =DecodingError= for this epoch is detected, then it means that the Oracle:
-- succeeded at updating the previously registered networks, and
-- failed at registering new networks.
-
-*** If Left Unattended
-In the advent of a new epoch, the Oracle will transition its internal state from
-=SubgraphDecodingError= to =Valid=, as if it had forgotten about the =DecodingErrors=
-from the previous epoch. Nonetheless, it will quickly fail again for the same reason at
-the entering epoch, as it will retry sending the very same =RegisterNetworks { N }=
-message that supposedly triggered the error in the first place.
-
-Effectively, the Oracle will lock itself in it's [[Preemptive State]], incapable of
-registering networks, but will try to update the currently registered networks at every
-new epoch.
-
-** On Error
-If the preemptive =SetBlockNumberForCurrentEpoch { O }= message triggers a second
-=DecodingError= in the Epoch Subgraph at the same epoch, then this means that the
-encoding error could have happened for all other message types.
-
-Thus the Oracle can't recover by itself and will change its mode once more and enter in
-its [[Alert State]].
-
-** Alert State
-It will stop sending messages to the DataEdge contract and will emit alerts
-periodically.
diff --git a/crates/oracle/docs/event-sourcing.org b/crates/oracle/docs/event-sourcing.org
deleted file mode 100644
index 3423207f..00000000
--- a/crates/oracle/docs/event-sourcing.org
+++ /dev/null
@@ -1,80 +0,0 @@
-#+title: Event Sourcing
-#+date: [2022-06-20 Mon 19:19]
-
-The *Event Source* is the component designated to poll the Epoch Block Oracle's *Indexed Chains* periodically and get their latest block numbers, so it can later assemble *Messages* to be sent in a transaction to the *DataEdge* smart contract.
-
-* Cycles
-The Epoch Block Oracle operations can take place in two different cycles: a longer cycle that encompasses one epoch at the *Protocol Chain* and a smaller cycle that is used to poll all *Indexed Chains*, also called the poll cycle.
-
-At each poll cycle, the Oracle will interact with all JSON RPC endpoints and try to obtain the latest block number for each *Indexed Chain* and update its in-memory state.
-
-At the beginning of each epoch cycle, the Oracle will then use the collected block numbers to assemble a new *Message* to the *DataEdge* smart contract.
-
-[[graphviz/event_sourcing.png]]
-
-* Indexed Chain Selection
-The *Indexed Chains* must be declared in the Block Oracle configuration file as an associative array where =CAIP-2= chain identifiers are the /keys/ and an array of JSON RPC endpoints are the /values/.
-
-The Oracle will ignore any chain that was *unregistered* by the Epoch Subgraph and will not attempt to interact with their JSON RPC endpoints, nor will it hold information about their latest block numbers.
-
-* Error Cases
-There are several ways where event sourcing can fail.
-
-** Transport failure
-How to deal with failed requests when fetching the latest block number from one, multiple, or all JSON RPC providers for a given chain?
-
-Since the Event Source will poll the latest block numbers continuously, it could hold the latest valid block number from previous attempts, but what if the latest obtained data is not fresh enough? What if a chain stops receiving updates for a whole epoch?
-
-*** Proposed Solution
-The Oracle uses an exponential backoff retry strategy for all JSON RPC requests to address short-term networking problems.
-
-Since the Event Source is constantly polling for new block numbers, it holds the latest retrieved information for each chain, so all Messages to the *DataEdge* contract will reference that.
-
-Suppose all providers for a given chain fail to provide a recent block for an extended period. In that case, the Event Source could fall back to using the previous epoch's block number or removing the network if the situation persists for longer than a specified number of epochs.
-
-#+begin_quote
-[[graphviz/event_sourcing_transport_errors.png]]
-
-The picture above shows how blocks are selected for the *Indexed Chains* at each epoch. Numbered boxes represent attempts to fetch the latest block numbers at a polling cycle, and failed attempts are marked with an "X".
-#+end_quote
-
-**** Alerting
-In case a chain has no updates during a whole epoch cycle, an alert should be emitted.
-
-** Re-orgs
-How to detect and deal with re-orgs among the *Indexed Chains*?
-
-*** Proposed Solution
-The Epoch Block Oracle will not seek to detect re-orgs.
-
-Instead, it will wait a given amount of time to allow for re-orgs and only submit a probabilistically final, "re-org safe" block. That block should be distant from the chain head and recognized by the majority of the providers.
-
-**** Keeping a distance from the chain head
-Each chain must be configured with a number expressing the distance from the chain head. This number will be subtracted from all polled block numbers from all JSON RPC requests.
-
-** Chain Head Consensus
-How to obtain consensus if the latest block number diverges between the JSON RPC providers for the same chain?
-
-*** Proposed Solution
-Since polled block numbers can vary between providers for the same network, the Event Source must seek consensus on the right block to select.
-
-The consensus check verifies that the *block hash* for the earliest block received is consistent across the simple majority of the providers. This routine will vary according to the number of configured JSON RPC endpoints for each network:
-
-+ One ::
-  Naturally, there will be no consensus check if a network is configured with a single provider.
-+ Two ::
-  Between two divergent block numbers, both providers must agree on the block hash of the earliest block.
-+ Three or more ::
-  A set with more than three block numbers will have its outliers filtered using [[https://en.wikipedia.org/wiki/Outlier#Tukey's_fences][Tukey's fences method]], and providers must agree on the block hash of the oldest block in the set.
-
-If providers do not consent on the selected block, then previous blocks will be selected and verified until consensus is reached under a time limit, in which case the polling operation will be considered as failed.
-
-**** Alerting
-A tolerable block distance can be configured for each chain to trigger a warning in case providers return widely differing block numbers.
-+ For two providers, this value will be interpreted as the absolute difference between returned block numbers.
-+ For three or more providers, it will be used as the number of standard deviations.
-
-* References
-- [[https://github.com/ChainAgnostic/CAIPs/blob/master/CAIPs/caip-2.md][CAIP-2 - Blockchain ID Specification]]
-- [[https://en.wikipedia.org/wiki/Interquartile_range#Outliers][Interquartile Range]]
-- [[https://en.wikipedia.org/wiki/Outlier#Tukey's_fences][Outlier detection using Tukey's fences method]]
diff --git a/crates/oracle/docs/graphviz/event_sourcing.dot b/crates/oracle/docs/graphviz/event_sourcing.dot
deleted file mode 100644
index 5314e513..00000000
--- a/crates/oracle/docs/graphviz/event_sourcing.dot
+++ /dev/null
@@ -1,51 +0,0 @@
-digraph event_soucing_cycles {
-  // General Graph Display
-  rankdir=LR
-  fontname="Helvetica,Arial,sans-serif"
-  edge [fontname="Helvetica,Arial,sans-serif"]
-  node [fontname="Helvetica,Arial,sans-serif", shape=rect,
-        style=filled, fillcolor="lightgray"]
-
-  // Node Definitions
-  subgraph cluster_poll_cycle {
-    label = "Polling Cycle"
-    EventSource [label="Event\nSource", fillcolor=darkolivegreen1]
-
-    ChainA [label="Chain A", fillcolor=gold]
-    ChainB [label="Chain B", fillcolor=steelblue]
-
-    ProviderA1 [label="Provider A1", fillcolor=wheat]
-    ProviderA2 [label="Provider A2", fillcolor=wheat]
-    ProviderA3 [label="Provider A3", fillcolor=wheat]
-
-    ProviderB1 [label="Provider B1", fillcolor=lightsteelblue]
-    ProviderB2 [label="Provider B2", fillcolor=lightsteelblue]
-    ProviderB3 [label="Provider B3", fillcolor=lightsteelblue]
-  }
-
-  subgraph cluster_epoch_cycle {
-    label = "Epoch Cycle"
-    Message [label="Assemble\nMessage"]
-    BroadcastTransaction [label="Broadcast\nTransaction"]
-  }
-
-  // Edges
-  ChainA -> ProviderA1 [arrowhead=none]
-  ChainA -> ProviderA2 [arrowhead=none]
-  ChainA -> ProviderA3 [arrowhead=none]
-
-  ChainB -> ProviderB1 [arrowhead=none]
-  ChainB -> ProviderB2 [arrowhead=none]
-  ChainB -> ProviderB3 [arrowhead=none]
-
-  ProviderA1 -> EventSource
-  ProviderA2 -> EventSource
-  ProviderA3 -> EventSource [label="Latest Block\nNumber", fontsize=11]
-  ProviderB1 -> EventSource
-  ProviderB2 -> EventSource
-  ProviderB3 -> EventSource
-
-  EventSource -> Message [label="Latest\nBlocks"]
-  Message -> BroadcastTransaction
-
-}
diff --git a/crates/oracle/docs/graphviz/event_sourcing.png b/crates/oracle/docs/graphviz/event_sourcing.png
deleted file mode 100644
index adbdc492..00000000
Binary files a/crates/oracle/docs/graphviz/event_sourcing.png and /dev/null differ
diff --git a/crates/oracle/docs/graphviz/event_sourcing_transport_errors.dot b/crates/oracle/docs/graphviz/event_sourcing_transport_errors.dot
deleted file mode 100644
index 71450748..00000000
--- a/crates/oracle/docs/graphviz/event_sourcing_transport_errors.dot
+++ /dev/null
@@ -1,79 +0,0 @@
-digraph event_sourcing_transport_errors {
-  // General Graph Display
-  newrank=true;
-  rankdir=LR
-  fontname="Helvetica,Arial,sans-serif"
-  edge [fontname="Helvetica,Arial,sans-serif"]
-  node [fontname="Helvetica,Arial,sans-serif", shape=rect,
-        style=filled, fillcolor="whitesmoke"]
-
-  subgraph cluster_chain_c {
-    label="Chain C\nLatest Block: Same as the previous epoch"
-    style=filled
-    color=grey88
-    CC1 [label="", style=invis]
-    CC2 [label="1 x", fontcolor=red]
-    CC3 [label="2 x", fontcolor=red]
-    CC4 [label="3 x", fontcolor=red]
-    CC5 [label="4 x", fontcolor=red]
-    CC6 [label="5 x", fontcolor=red]
-    CC7 [label="6 x", fontcolor=red]
-    CC8 [label="7 x", fontcolor=red]
-    CC9 [label="", style=invis]
-    CC2 -> CC3 -> CC4 -> CC5 -> CC6 -> CC7 -> CC8
-  }
-
-  subgraph cluster_chain_b {
-    label="Chain B\nLatest Block: 5"
-    style=filled
-    color=grey88
-    CB1 [label="", style=invis]
-    CB2 [label="1 ✓"]
-    CB3 [label="2 ✓"]
-    CB4 [label="3 ✓"]
-    CB5 [label="4 x", fontcolor=red]
-    CB6 [label="5 ✓"]
-    CB7 [label="6 x", fontcolor=red]
-    CB8 [label="7 x", fontcolor=red]
-    CB9 [label="", style=invis]
-    CB2 -> CB3 -> CB4 -> CB5 -> CB6 -> CB7 -> CB8
-  }
-
-  subgraph cluster_chain_a {
-    label="Chain A\nLatest Block: 6"
-    style=filled
-    color=grey88
-    CA1 [label="", style=invis]
-    CA2 [label="1 ✓"]
-    CA3 [label="2 ✓"]
-    CA4 [label="3 ✓"]
-    CA5 [label="4 ✓"]
-    CA6 [label="5 ✓"]
-    CA7 [label="6 ✓"]
-    CA8 [label="7 x", fontcolor=red]
-    CA9 [label="", style=invis]
-    CA2 -> CA3 -> CA4 -> CA5 -> CA6 -> CA7 -> CA8
-  }
-
-  subgraph cluster_protocol_chain {
-    label="Protocol Chain"
-    style=filled
-    color=grey78
-    E1 [label="Epoch\nStart"]
-    E2 [label="Epoch\nEnd"]
-    E1 -> E2 [arrowhead=none]
-  }
-
-  // Alignment
-  { rank=same; E1; CA1; CB1; CC1}
-  { rank=same; E2; CA9; CB9; CC9}
-
-  // Invisible edges
-  edge[style=invis]
-  CA1 -> CA2 [style=invis]
-  CA8 -> CA9 [style=invis]
-  CB1 -> CB2 [style=invis]
-  CB8 -> CB9 [style=invis]
-  CC1 -> CC2 [style=invis]
-  CC8 -> CC9 [style=invis]
-}
diff --git a/crates/oracle/docs/graphviz/event_sourcing_transport_errors.png b/crates/oracle/docs/graphviz/event_sourcing_transport_errors.png
deleted file mode 100644
index 2bc134b2..00000000
Binary files a/crates/oracle/docs/graphviz/event_sourcing_transport_errors.png and /dev/null differ