(WIP) Add a draft spec for RMN OffChain Blessing #1043

Closed

rstout wants to merge 2 commits into ccip-develop from rs/commit_plugin_rmn_ocb_spec

Contributor

rstout commented Jun 17, 2024

Motivation

Solution

rstout requested a review from a team as a code owner

June 17, 2024 20:30

rstout temporarily deployed to sdlc

June 17, 2024 20:30

— with

GitHub Actions Inactive

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from d6610f5 to 2223947 Compare

June 17, 2024 21:58

rstout temporarily deployed to sdlc

June 17, 2024 21:59

— with

GitHub Actions Inactive

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from 2223947 to 0b67563 Compare

June 18, 2024 20:45

rstout temporarily deployed to sdlc

June 18, 2024 20:45

— with

GitHub Actions Inactive

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from 0b67563 to 80558ae Compare

June 19, 2024 14:45

rstout temporarily deployed to sdlc

June 19, 2024 14:46

— with

GitHub Actions Inactive

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from 80558ae to a1d27cf Compare

June 19, 2024 15:00

rstout temporarily deployed to sdlc

June 19, 2024 15:00

— with

GitHub Actions Inactive

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from a1d27cf to 31ec17f Compare

June 24, 2024 22:46

rstout temporarily deployed to sdlc

June 24, 2024 22:46

— with

GitHub Actions Inactive

makramkd reviewed

Contributor

makramkd left a comment (edited)

Overall makes sense but is a bit tough to follow. There is an implicit state machine in the existing impl that basically matches what you have laid out here:

1. Determine intervals to be used in the next round.
2. If the previous outcome is not nil, read messages from readable chains and include those in the observation.
3. In outcome, agree on intervals.

re: your point here:

the complications that can arise if a report is not successfully transmitted (as we explicitly only continue once we know the previous report has been committed).

This is typically what ShouldAccept and ShouldTransmit are used for; it seems a bit odd to me to have states regarding report acceptance onchain.

Some things that would help me personally:

- State machine diagram with clear start and terminal states
- Handling the nil outcome in the spec code

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py Outdated Show resolved Hide resolved

dimkouv reviewed

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py Outdated Show resolved Hide resolved

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                          case ReportGenerated(signed_intervals):
+                              return CommitReport(signed_intervals)
+                          case _:
+                              return None

Contributor

dimkouv Jun 28, 2024

We also need to figure out how token/gas prices fit in this design.

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py Outdated

+                      pass
+                  # TODO: doc
+                  def query(self, previous_outcome: CommitOutcome) -> CommitQuery:

Contributor

dimkouv Jun 28, 2024

We are missing the validation, shouldTransmit, and shouldAccept phases. Is that a follow-up?

Contributor

dimkouv Jun 28, 2024 (edited)

For example, we need to define what happens after, say, shouldAccept returns false. In that case, based on the previous outcome state, we'll keep waiting for the report to be committed (until we reach the retry limit), even though that will never happen.

Contributor Author

rstout Jun 28, 2024

I haven't thought much about validation, shouldTransmit, shouldAccept honestly. I don't think there's much we can do besides exhaust the "max committed seq num" checks. How acceptable that is depends on how often we expect shouldTransmit/shouldAccept to fail.

dimkouv reviewed

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py Outdated Show resolved Hide resolved

rstout requested review from a team, elatoskinas, RayXpub and jasonmci as code owners

June 28, 2024 18:43

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from 9df28ae to 977d264 Compare

June 28, 2024 18:47

rstout removed request for a team, jasonmci, elatoskinas and RayXpub

June 28, 2024 18:48

dimkouv reviewed

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

		@@ -0,0 +1,382 @@
		"""

Contributor

dimkouv Jul 1, 2024

maybe move PR to https://github.com/smartcontractkit/chainlink-ccip

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from 977d264 to 3132f5a Compare

July 1, 2024 17:12

rstout temporarily deployed to sdlc

July 1, 2024 17:12

— with

GitHub Actions Inactive

connorwstein reviewed

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py Outdated

+                    - Determine intervals to be used in the next round
+. BuildingReport
+                    - Build a report from the intervals determined in the previous round
+. CheckingForUpdatedMaxCommittedSeqNums

Collaborator

connorwstein Jul 1, 2024

Might be easier to follow the states if we name them in relation to the state of the report? Like SelectingMessagesForReport, BuildingReport, WaitingForReportInclusion

Contributor Author

rstout Jul 1, 2024

Agreed
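
A minimal sketch of how report-centric state names could read in the spec, with the nil previous outcome handled explicitly. The three class names follow the suggestion above; `resolve`, `advance`, `report_included`, and `max_attempts` are hypothetical names for illustration and are not taken from the draft.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class SelectingMessagesForReport:
    """Start state: agree on the intervals to report."""


@dataclass(frozen=True)
class BuildingReport:
    """Build a report from the intervals agreed in the previous round."""


@dataclass(frozen=True)
class WaitingForReportInclusion:
    """Wait until the report is observed on the destination chain."""
    attempts: int = 0


State = Union[SelectingMessagesForReport, BuildingReport, WaitingForReportInclusion]


def resolve(previous: Optional[State]) -> State:
    # A nil previous outcome (for example, the first round) maps to the start state.
    return SelectingMessagesForReport() if previous is None else previous


def advance(state: State, report_included: bool, max_attempts: int) -> State:
    # Return the state that the next outcome should carry.
    if isinstance(state, SelectingMessagesForReport):
        return BuildingReport()
    if isinstance(state, BuildingReport):
        return WaitingForReportInclusion(attempts=0)
    # Waiting: start over once the report lands or the retries run out.
    if report_included or state.attempts + 1 >= max_attempts:
        return SelectingMessagesForReport()
    return WaitingForReportInclusion(attempts=state.attempts + 1)
```

Keeping `advance` free of I/O keeps it pure, so any stuck state can be reasoned about from the inputs alone.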

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py Outdated

+                              observed_merkle_roots = self.flatten_merkle_root_observations(observations)
+                              signed_intervals = self.get_signed_intervals_to_report(intervals, query.signed_intervals,
+                                                                                     observed_merkle_roots)
+                              return ReportGenerated(signed_intervals)

Collaborator

connorwstein Jul 1, 2024

I think it's possible that signed_intervals ends up being empty (unlikely, but say >f_chain nodes briefly go down in either RMN or CCIP after providing a set of intervals). In that case I think we'd want to go back to ChoosingIntervals to start gathering a potentially larger batch.

Contributor Author

rstout Jul 2, 2024

Absolutely, great call out!
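
A minimal sketch of the fallback discussed in this thread: when no chain survives into `signed_intervals`, the outcome signals a return to interval selection instead of producing an empty report. `ReportGenerated` appears in the draft; `ReportEmpty` and `building_report_outcome` are hypothetical names, and the value type of `signed_intervals` is simplified to `bytes` here.

```python
from dataclasses import dataclass
from typing import Dict, Union

ChainSelector = int  # simplified stand-in for the draft's ChainSelector type


@dataclass
class ReportGenerated:
    signed_intervals: Dict[ChainSelector, bytes]


@dataclass
class ReportEmpty:
    """Hypothetical outcome: nothing to report, so select intervals again."""


def building_report_outcome(
    signed_intervals: Dict[ChainSelector, bytes],
) -> Union[ReportGenerated, ReportEmpty]:
    # An empty mapping means no chain reached consensus with valid signatures.
    if not signed_intervals:
        return ReportEmpty()
    return ReportGenerated(signed_intervals)
```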

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                      pass
+                  # The OCR3 implementation of Outcome
+                  def outcome(

Collaborator

connorwstein Jul 1, 2024

Maybe worth noting somewhere that if this returns an error repeatedly, we could get stuck in a state: if outcome errors, the next round has the same prevOutcome. However, since it's a pure function, we expect to fully control the error scenarios.

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                  # have different merkle roots for the same chain, this chain is not included in the output. Additionally, if there
+                  # are chains that don't require RMN support, these chains will be in observed_merkle_roots but not
+                  # rmn_signed_intervals, and will be included in the output (with an empty set of RMN signatures).
+                  def get_signed_intervals_to_report(

Collaborator

connorwstein Jul 1, 2024

I think this function is worth spelling out in the spec. You'd need CCIP f_chain and RMN f_chain defined on the class or passed as inputs. We can include them in config for now (we can just copy in CommitConfig from commit_plugin.py).

Contributor Author

rstout Jul 2, 2024

I think it'll be simpler to implement this first in the Go code and then port it over

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                      pass
+                  # Verify the RMN signatures on the given signed_intervals
+                  def verify_signed_intervals(

Collaborator

connorwstein Jul 1, 2024

Maybe this should be called inside get_signed_intervals_to_report? Then if a sig doesn't validate, we exclude that observation but don't error.
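
A sketch of what that could look like, under stated assumptions. The behaviour follows the draft's comment on `get_signed_intervals_to_report` (mismatched roots drop the chain; chains absent from the RMN result are included with no signatures) plus the suggestion to skip invalid signatures without raising. The `SignedInterval` fields, the `(min, max)` tuple shape of `intervals`, and the `verify` callable are hypothetical; the draft's types may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

ChainSelector = int  # simplified stand-in for the draft's ChainSelector type


@dataclass
class SignedInterval:
    # Hypothetical fields; the draft's SignedInterval may differ.
    min_seq_num: int
    max_seq_num: int
    merkle_root: bytes
    rmn_signatures: List[bytes] = field(default_factory=list)


def get_signed_intervals_to_report(
    intervals: Dict[ChainSelector, Tuple[int, int]],
    rmn_signed_intervals: Dict[ChainSelector, SignedInterval],
    observed_merkle_roots: Dict[ChainSelector, bytes],
    verify: Callable[[SignedInterval], bool],
) -> Dict[ChainSelector, SignedInterval]:
    result: Dict[ChainSelector, SignedInterval] = {}
    for chain, root in observed_merkle_roots.items():
        if chain not in intervals:
            continue  # no agreed interval for this chain
        signed = rmn_signed_intervals.get(chain)
        if signed is None:
            # No RMN entry for this chain: include it with an empty signature set.
            lo, hi = intervals[chain]
            result[chain] = SignedInterval(lo, hi, root)
        elif signed.merkle_root == root and verify(signed):
            result[chain] = signed
        # Otherwise the roots disagree or a signature is invalid: skip, no error.
    return result
```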

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                  ) -> Dict[ChainSelector, SignedInterval]:
+                      pass
+                  # Given a list of SequenceNumbersObservation, return a flattened consensus on the max committed sequence number

Collaborator

connorwstein Jul 1, 2024

Worth noting that this is the same logic as here:

ccip/core/services/ocr3/plugins/ccip/commit/plugin_functions.go

Line 356 in 0648a92

func maxSeqNumsConsensus(

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py Show resolved Hide resolved

rstout added 2 commits

July 2, 2024 11:09


          (WIP) Add a draft spec for RMN OffChain Blessing

2081d06


          Add explicit commit plugin states, more outcomes, and a state machine…

367a8b4

… diagram

rstout force-pushed the rs/commit_plugin_rmn_ocb_spec branch from 3132f5a to 367a8b4 Compare

July 2, 2024 18:09

rstout had a problem deploying to sdlc

July 2, 2024 18:09

— with

GitHub Actions Failure

kaleofduty reviewed

Contributor

kaleofduty left a comment

nice! here are some initial questions

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                  def request_max_seq_nums(
+                          self,
+                          chains: List[ChainSelector]
+                  ) -> Dict[ChainSelector, int]:

Contributor

kaleofduty Jul 3, 2024

Should these be signed?

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py



		@dataclass
		class SignedInterval:

Contributor

kaleofduty Jul 3, 2024

I find this name a little confusing, it's the root that's signed, no?

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                                  return ReportNotYetTransmitted(previous_max_committed_seq_nums, attempts + 1)
+                  # The OCR3 implementation of Report
+                  def report(self, outcome: CommitOutcome) -> Optional[CommitReport]:

Contributor

kaleofduty Jul 3, 2024

Would it be interesting to split into multiple reports in case they get too large?

Collaborator

connorwstein Jul 3, 2024

yeah potentially once we have enough chains s.t. roots can't easily fit in one report (long way away though)
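
A minimal sketch of what splitting could look like if it is ever needed. `split_roots` and `max_roots_per_report` are hypothetical; the draft produces a single report, and a real size limit would more likely be expressed in bytes or gas than in a root count.

```python
from typing import Dict, List

ChainSelector = int  # simplified stand-in for the draft's ChainSelector type


def split_roots(
    roots: Dict[ChainSelector, bytes],
    max_roots_per_report: int,
) -> List[Dict[ChainSelector, bytes]]:
    if max_roots_per_report < 1:
        raise ValueError("max_roots_per_report must be at least 1")
    # Sort by chain selector so every node derives the same batches.
    chains = sorted(roots)
    return [
        {chain: roots[chain] for chain in chains[i:i + max_roots_per_report]}
        for i in range(0, len(chains), max_roots_per_report)
    ]
```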

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

Comment on lines +372 to +373

		observed_merkle_roots = self.flatten_merkle_root_observations(observations)
		signed_intervals = self.get_signed_intervals_to_report(intervals, query.signed_intervals,

Contributor

kaleofduty Jul 3, 2024

Where do we check that only nodes that have been assigned the role of reading from a chain include it in their intervals?

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                          # If we are choosing the next intervals this round, we need to query RMN for the max uncommitted sequence
+                          # numbers it has for each source chain, so we can set appropriate upper ranges for our intervals.
+                          case SelectingIntervalsForReport():
+                              rmn_max_seq_nums = self.rmn_client.request_max_seq_nums(self.all_source_chains)

Contributor

kaleofduty Jul 3, 2024

is this sync or async?

Collaborator

connorwstein Jul 3, 2024

Discussed live. We want async because:

- It avoids a long query timeout, which leads to a longer deltaProgress (slowing OCR leader rotation).
- The network calls themselves can run in parallel.
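
A sketch of the parallel part of that point using `asyncio`, under stated assumptions: `fetch` stands in for a per-chain RMN request and is hypothetical, as are `request_max_seq_nums_parallel` and `timeout_s`. It does not show how results would be cached between rounds, which the thread leaves open.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional, Tuple

ChainSelector = int  # simplified stand-in for the draft's ChainSelector type


async def request_max_seq_nums_parallel(
    fetch: Callable[[ChainSelector], Awaitable[int]],
    chains: List[ChainSelector],
    timeout_s: float,
) -> Dict[ChainSelector, int]:
    async def one(chain: ChainSelector) -> Tuple[ChainSelector, Optional[int]]:
        try:
            # Bound each call so one slow chain cannot stall the others.
            return chain, await asyncio.wait_for(fetch(chain), timeout_s)
        except (asyncio.TimeoutError, ConnectionError):
            return chain, None

    # All requests are in flight concurrently; total latency is about one timeout.
    results = await asyncio.gather(*(one(chain) for chain in chains))
    return {chain: value for chain, value in results if value is not None}
```

Chains that time out are simply absent from the result, so the caller can proceed with whatever subset responded.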

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                      pass
+                  # The OCR3 implementation of Outcome
+                  def outcome(

Contributor

kaleofduty Jul 3, 2024

Double-checking: this function is pure, right?

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                              return SequenceNumbersObservation(self.get_max_committed_seq_nums(), {})
+                  # Given a list of MerkleRootObservations, return a flattened consensus on the merkle root for each source chain
+                  def flatten_merkle_root_observations(self, observations: List[CommitObservation]) -> Dict[ChainSelector, bytes]:

Contributor

kaleofduty Jul 3, 2024

how does this function work internally?

Collaborator

connorwstein Jul 3, 2024

The most-voted root, provided it has at least f+1 votes.
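
One way to read that rule, sketched under assumptions. `flatten_merkle_roots` is a simplified, hypothetical variant that takes plain per-node dicts rather than the draft's `CommitObservation`, and breaking ties by byte order is an assumption added only to make the result deterministic.

```python
from collections import Counter
from typing import Dict, List

ChainSelector = int  # simplified stand-in for the draft's ChainSelector type


def flatten_merkle_roots(
    observations: List[Dict[ChainSelector, bytes]],
    f: int,
) -> Dict[ChainSelector, bytes]:
    # Tally the votes for each root, per source chain.
    votes: Dict[ChainSelector, Counter] = {}
    for observation in observations:
        for chain, root in observation.items():
            votes.setdefault(chain, Counter())[root] += 1

    consensus: Dict[ChainSelector, bytes] = {}
    for chain, counter in votes.items():
        # Highest vote count wins; ties are broken by byte order (assumption).
        root, count = max(counter.items(), key=lambda kv: (kv[1], kv[0]))
        # Require f+1 votes so at least one honest node observed this root.
        if count >= f + 1:
            consensus[chain] = root
    return consensus
```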

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                          # exhausted, or return ReportNotYetTransmitted with an incremented "attempts" value otherwise
+                          case WaitingForReportTransmission(previous_max_committed_seq_nums, attempts):
+                              max_committed_seq_nums = self.flatten_max_committed_seq_nums_observations(observations)
+                              if self.max_committed_seq_nums_are_updated(max_committed_seq_nums, previous_max_committed_seq_nums):

Contributor

kaleofduty Jul 3, 2024

does this evaluate to true whenever they change, or only when they match what we transmitted?
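
The two readings of this question can be made concrete. A sketch in which both function names and `reported_max_seq_nums` are hypothetical; the draft's `max_committed_seq_nums_are_updated` may implement either reading or something else.

```python
from typing import Dict

ChainSelector = int  # simplified stand-in for the draft's ChainSelector type


def any_changed(
    current: Dict[ChainSelector, int],
    previous: Dict[ChainSelector, int],
) -> bool:
    # Reading 1: true whenever the onchain values differ from the last observation.
    return current != previous


def matches_transmitted(
    current: Dict[ChainSelector, int],
    reported_max_seq_nums: Dict[ChainSelector, int],
) -> bool:
    # Reading 2: true only once every chain in our report has reached the
    # sequence number we transmitted. Using >= rather than == is an assumption,
    # to tolerate a later report advancing the value further.
    return all(
        chain in current and current[chain] >= max_seq
        for chain, max_seq in reported_max_seq_nums.items()
    )
```

The readings diverge when some other report lands first: the values change, but not to what this round transmitted.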

core/services/ocr3/plugins/ccip/spec/commit_plugin_sm_draft.py

+                  dest_chain: ChainSelector
+                  chain_readers: Dict[ChainSelector, ChainReader]
+                  f: int
+                  max_check_report_persisted_attempts: int

Contributor

kaleofduty Jul 3, 2024

- Consider making this wall-clock time?
- Consider checking vs SharedConfig

Contributor

github-actions bot commented Aug 18, 2024

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions bot added the Stale label

Contributor

github-actions bot commented Aug 29, 2024

This PR was closed because it has been stalled for 10 days with no activity.

github-actions bot closed this
