Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove query pool locks #1472

Merged
merged 1 commit into from
Sep 23, 2024
Merged

Conversation

carver
Copy link
Collaborator

@carver carver commented Sep 20, 2024

What was wrong?

The locks on the query pool are unnecessary. Locks suck. Remove them wherever possible.

How was it fixed?

Moving the query pool into a future::poll seems to be the only place that clone was required. It is relatively straightforward to route around that.

To-Do

@carver carver force-pushed the remove-query-pool-locks branch 2 times, most recently from 9b95cc3 to 66cd187 Compare September 20, 2024 23:49
Comment on lines 494 to 505
if let Poll::Ready(result) = Self::generate_query_event_poll(queries) {
result
} else {
// sleep forever
std::future::pending().await
}
}

fn generate_query_event_poll<TQuery: Query<NodeId>>(
queries: &mut QueryPool<NodeId, TQuery, TContentKey>,
) -> Poll<QueryEvent<TQuery, TContentKey>> {
match queries.poll() {
Copy link
Member

@KolbyML KolbyML Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/sigp/discv5/blame/682b51e317216b1e2ec3d75b541cdff66d27cf32/src/service.rs#L1547-L1562

All of this RFC/RFN code is originally based from sigp/discv5 implementation

Suggested change
if let Poll::Ready(result) = Self::generate_query_event_poll(queries) {
result
} else {
// sleep forever
std::future::pending().await
}
}
fn generate_query_event_poll<TQuery: Query<NodeId>>(
queries: &mut QueryPool<NodeId, TQuery, TContentKey>,
) -> Poll<QueryEvent<TQuery, TContentKey>> {
match queries.poll() {
future::poll_fn(move |_cx| match queries.poll() {

Can't we just do this?

Sigp doesn't use arc<mutex>'s either if you view the link I sent

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably? I'll keep working on it. This PR is definitely still a WIP. I just wanted to get it posted early because I remember Milos saying he might work on lock removal, and I want to avoid duplicate work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just brought it up because

            // sleep forever
            std::future::pending().await

^ Tastes weird,

and I think it can just be fully avoided by using future::poll_fn(move |_cx| match queries.poll() { instead (which is what sigp/discv5 does and they stopped using a lock here 4ish years ago)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I totally agree that that first implementation was not ideal. (it is worth checking, but my understanding is that the code is functionally equivalent, though)

The most important thing I wanted to do is see if the full test suite passes when the locks are gone. It looks like going back to the poll_fn implementation still works (can't remember what led me away from it at first), so the PR will almost certainly end up with that approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good 🚀

@carver
Copy link
Collaborator Author

carver commented Sep 20, 2024

@morph-dev you mentioned you were looking into other lock removal. I stumbled into this one, in the process of working on recursive find content. It feels unfinished: I probably want to refactor query_event_poll and its helper some more. But I figured I'd give an early heads up, in case you were starting to work on the same thing.

@carver carver force-pushed the remove-query-pool-locks branch from 66cd187 to bf3d383 Compare September 21, 2024 00:17
@carver
Copy link
Collaborator Author

carver commented Sep 21, 2024

For reference, the locks got added in #809

I don't remember the conversation at all, but apparently I didn't like them back then either 😆 : #809 (comment)

I took a peek at the PR, but didn't dive deep. I don't understand why they were added back then, really.

@njgheorghita any sense of why they were added, or if removing them could be a problem? Or does it all look good to you?

Also, you mentioned that you made an issue to remove the locks, but I didn't see it. Is there an issue somewhere that should close with this?

@carver carver marked this pull request as ready for review September 21, 2024 00:33
@carver carver requested a review from njgheorghita September 21, 2024 00:33
@carver carver self-assigned this Sep 21, 2024
Copy link
Member

@KolbyML KolbyML left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@morph-dev
Copy link
Collaborator

Thanks for the heads up. I was planning into looking into kbuckets, but my plan was not to remove the lock, but refactor them in a way so that they can't cause deadlock.

Regarding this change, I'm a bit worried that without lock, there could be some ugly race conditions. On one side, I believe that rust would prevent that, but on the other hand I'm a bit paranoid when it comes to multi-threading. It could also be due to my not so extensive familiarity with OverlayService (which is quite complex), so feel free to ignore my worry :).

Copy link
Collaborator

@njgheorghita njgheorghita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Honestly, can't exactly remember the rationale for introducing the locks. I'm sure I had one, but it's not obvious that it was a good one since I looked back over that pr and couldn't figure it out. I'm guessing the "issue" I was referring to is just the pr description from #809, where you can see that I listed this as a follow-up and then never followed up 😅

@carver carver merged commit 42947e4 into ethereum:master Sep 23, 2024
9 checks passed
@carver carver deleted the remove-query-pool-locks branch September 23, 2024 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants