Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Continue searching for content when an earlier transfer of content fails, or is invalidated #1467

Merged
merged 1 commit into from
Sep 24, 2024

Conversation

carver
Copy link
Collaborator

@carver carver commented Sep 19, 2024

What was wrong?

Fix #1383

How was it fixed?

  • send a new message back to the query when the transfer fails, or the validation fails
  • resume the search for content when the new message arrives
  • expand the existing test to verify that content is eventually found, even if it sometimes is invalid

To-Do

If a find content query is transferring content from a peer, and it
fails, then resume pinging peers to get the content from another one.

Ways we have seen the content transfer fail:
- peer disappears during utp transfer
- peer sends invalid content
- peer implements utp differently

For now, partially to simplify testing and architecture, we only
validate one piece of content at a time. When a piece of content is
returned, we prioritize starting validation. When validation has
started, we pause all other activity: finding new peers, finding the
content elsewhere, and validating more copies of already-found content.
Those things resume only after the active validation fails.

It could be nice to run validation on 2 or more copies of content at a
time, but testing this would be much more complicated (at least with the
current testing approach).
@@ -56,15 +56,15 @@ pub enum FindContentQueryPending<TNodeId> {
// peer that sent the content
peer: TNodeId,
// channel used to send back the validated content info
valid_content_tx: Sender<ValidatedContent<TNodeId>>,
valid_content_tx: Sender<Option<ValidatedContent<TNodeId>>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We spoke about how to differentiate between a failed utp and failed validation within the trace, and you mentioned it would be tough to send the trace between threads. I think we could instead pass back a wrapper datatype enum here with ValidatedContent, FailedUtp, and FailedValidation variants instead of None on failure.

Not a blocker on this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I see where you're going.

As a preview of where my mind is at: since the query trace is not accessible from inside the query, I think the distinction between the types of failure might not matter to the query itself. The trace will have to know the distinction, but that will be out in the overlay service somewhere.

@carver carver merged commit b28beff into ethereum:master Sep 24, 2024
9 checks passed
@carver carver deleted the rfc-retries branch September 24, 2024 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Robust RecursiveFindContent | Continue the search if data transfer fails
2 participants