server: Implement separate mutex for peer state. #3251

Merged: davecgh merged 13 commits into decred:master from server_peer_state_mtx on May 16, 2024

Conversation

@davecgh davecgh commented May 9, 2024

Currently, all operations related to adding, removing, and banning peers are implemented via a single goroutine and channels to protect concurrent access.

The existing implementation has worked well for over a decade; however, it is not ideal in a few ways:

  • It is difficult to improve parallel processing since everything is forced into a single goroutine
  • It requires a lot of plumbing to provide access to any related information
  • The use of multiple channels means the ordering of events is not entirely well defined

In regards to the final point about event ordering, one example is that it's possible for a peer to be removed before it was ever added. The surrounding code is aware of these possibilities and handles things gracefully, but it really is not ideal.
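
To illustrate the ordering concern, here is a minimal sketch. It is not the dcrd code, and the names `addCh`, `removeCh`, and `firstEvent` are hypothetical. When two channels are both ready, Go's `select` chooses between them pseudo-randomly, so a handler goroutine that reads from separate "add" and "remove" channels can observe the remove first even though the add was sent earlier.

```go
package main

import "fmt"

// firstEvent queues one "add" and one "remove" event on separate buffered
// channels, in that order, and reports which one a select statement
// receives first.  All names here are hypothetical.
func firstEvent() string {
	addCh := make(chan string, 1)
	removeCh := make(chan string, 1)

	addCh <- "add"       // Sent first.
	removeCh <- "remove" // Sent second.

	// Both channels are ready, so select picks one pseudo-randomly.
	select {
	case ev := <-addCh:
		return ev
	case ev := <-removeCh:
		return ev
	}
}

func main() {
	counts := make(map[string]int)
	for i := 0; i < 1000; i++ {
		counts[firstEvent()]++
	}
	fmt.Println("add first:", counts["add"], "remove first:", counts["remove"])
}
```

Running this typically shows both orderings occurring, which is the kind of ambiguity the surrounding code has had to tolerate.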

In practice, channels are best suited for passing ownership of data, distributing units of work, and communicating async results while mutexes are better suited for caches and state.

Converting all of the code related to updating and querying the server's peer state to synchronous code that makes use of a separate mutex to protect it will address the aforementioned concerns. Namely, it:

  • Improves the semantics in regards to the aforementioned ordering
  • Ultimately allows more code to run in parallel in the individual peer goroutines
  • Requires less plumbing for updating and querying the state
  • Makes the state available to calling code so it can make better decisions
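
As a rough sketch of the target pattern, assuming a much simpler state than the real server keeps, the example below guards a peer map with its own mutex and exposes synchronous methods. The type `peerState` and the methods `addPeer`, `removePeer`, and `count` are hypothetical names chosen for illustration; they are not taken from the dcrd source.

```go
package main

import (
	"fmt"
	"sync"
)

// peerState is a hypothetical, simplified stand-in for the server's peer
// state.  The peers map is protected by mtx.
type peerState struct {
	mtx   sync.Mutex
	peers map[int32]string // peer ID -> address
}

func newPeerState() *peerState {
	return &peerState{peers: make(map[int32]string)}
}

// addPeer records a connected peer.  It returns false when the ID is
// already present.
func (ps *peerState) addPeer(id int32, addr string) bool {
	ps.mtx.Lock()
	defer ps.mtx.Unlock()

	if _, ok := ps.peers[id]; ok {
		return false
	}
	ps.peers[id] = addr
	return true
}

// removePeer removes a peer.  It returns false when the ID is unknown, so a
// caller can tell that a removal arrived for a peer that was never added.
func (ps *peerState) removePeer(id int32) bool {
	ps.mtx.Lock()
	defer ps.mtx.Unlock()

	if _, ok := ps.peers[id]; !ok {
		return false
	}
	delete(ps.peers, id)
	return true
}

// count returns the number of connected peers.
func (ps *peerState) count() int {
	ps.mtx.Lock()
	defer ps.mtx.Unlock()
	return len(ps.peers)
}

func main() {
	ps := newPeerState()

	// Each peer goroutine updates the shared state directly.
	var wg sync.WaitGroup
	for i := int32(0); i < 100; i++ {
		wg.Add(1)
		go func(id int32) {
			defer wg.Done()
			ps.addPeer(id, fmt.Sprintf("peer-%d", id))
		}(i)
	}
	wg.Wait()

	fmt.Println("connected peers:", ps.count())
}
```

Because each method returns its result directly to the caller, the calling code can act on the outcome immediately instead of waiting on a reply from a handler goroutine.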

This consists of a series of commits to help ease the review process. Each commit is intended to be a self-contained, logically easy-to-follow change such that the code continues to compile and the tests continue to pass at each step.

See the description of each commit for further details.

Fixes #2694.

@davecgh davecgh added this to the 1.9.0 milestone May 9, 2024
@davecgh davecgh force-pushed the server_peer_state_mtx branch from 866b840 to 7772437 Compare May 9, 2024 23:57
@davecgh davecgh force-pushed the server_peer_state_mtx branch from 7772437 to 0d0fbe7 Compare May 16, 2024 15:02
davecgh added 13 commits May 16, 2024 10:16
Currently, all operations related to adding, removing, and banning peers
are implemented via a single goroutine and channels to protect
concurrent access.

The existing implementation has worked well for over a decade; however,
it is not ideal in a few ways:

- It is difficult to improve parallel processing since everything is
  forced into a single goroutine
- It requires a lot of plumbing to provide access to any related
  information
- The use of multiple channels means the ordering of events is not
  entirely well defined.

In regards to the final point about event ordering, one example is that
it's possible for a peer to be removed before it was ever added.  The
surrounding code is aware of these possibilities and handles things
gracefully, but it really is not ideal.

In practice, channels are best suited for passing ownership of data,
distributing units of work, and communicating async results while
mutexes are better suited for caches and state.

Converting all of the code related to updating and querying the server's
peer state to synchronous code that makes use of a separate mutex to
protect it will address the aforementioned concerns.  Namely, it:

- Improves the semantics in regards to the aforementioned ordering
- Ultimately allows more code to run in parallel in the individual peer
  goroutines
- Requires less plumbing for updating and querying the state
- Makes the state available to calling code so it can make better
  decisions

Thus, this is part of a series of commits that aims to make that
conversion.

This first commit introduces a separate mutex on the peer state and
updates the code to protect all references to the relevant fields with
that mutex.  This will allow future commits to methodically refactor the
various operations without introducing races.
This updates the server peer connection request field to be safe for
concurrent access by making it an atomic pointer.

It also reorders some of the server peer fields and adds some comments
that call out the concurrency semantics while here.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
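
A minimal sketch of the atomic pointer technique follows, assuming Go 1.19 or later for the generic `atomic.Pointer` type. The `connRequest` and `serverPeer` types and their fields are hypothetical simplifications, not the actual dcrd definitions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// connRequest is a hypothetical stand-in for a connection request.
type connRequest struct {
	id   uint64
	addr string
}

// serverPeer is a simplified, hypothetical peer type.
type serverPeer struct {
	// connReq may be read and written from multiple goroutines, so it is
	// stored in an atomic pointer rather than a plain pointer field.
	connReq atomic.Pointer[connRequest]
}

func main() {
	var sp serverPeer

	// The zero value holds a nil pointer.
	fmt.Println("has request:", sp.connReq.Load() != nil)

	// Store publishes the request so that concurrent readers see either the
	// old pointer or the new one, never a torn value.
	sp.connReq.Store(&connRequest{id: 1, addr: "192.0.2.10:9108"})

	if cr := sp.connReq.Load(); cr != nil {
		fmt.Println("request:", cr.id, cr.addr)
	}
}
```

An atomic pointer only makes loading and storing the pointer itself safe. The pointed-to value should still be treated as immutable once published, or be protected by other means.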
This modifies the connectionsWithIP method in the server to make use of
the forAllPeers iterator instead of repeating the same logic for each
peer category.
This refactors the logic related to adding a connected peer to the
server out of the peer handler since it is now protected by the newly
added peer state mutex.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic related to removing a disconnected peer from
the server out of the peer handler since it is now protected by the
newly added peer state mutex.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic for querying the number of connected peers out
of the peer handler since it is now protected by the newly added peer
state mutex.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic for querying the outbound group counts out of
the peer handler since it is now protected by the newly added peer state
mutex.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic for manually connecting peers out of the peer
handler since it is now protected by the newly added peer state mutex.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic for canceling a pending connection out of the
peer handler since it no longer needs to be plumbed through the query
channel.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic for manually removing persistent peers out of
the peer handler since it no longer needs to be plumbed through the
query channel.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic for querying the persistent peers out of the
peer handler since it is now protected by the newly added peer state
mutex and no longer needs to be plumbed through the query channel.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This refactors the logic for manually disconnecting peers out of the
peer handler since it no longer needs to be plumbed through the query
channel.

This is a part of the overall effort to convert all of the code related
to updating and querying the server's peer state to synchronous code
that makes use of a separate mutex to protect it.
This removes the query channel and associated handler and plumbing now
that it is no longer used.

This completes the overall effort to convert all of the code related to
updating and querying the server's peer state to synchronous code that
makes use of a separate mutex to protect it.
@davecgh davecgh force-pushed the server_peer_state_mtx branch from 0d0fbe7 to 78bf86b Compare May 16, 2024 15:17
@davecgh davecgh merged commit 78bf86b into decred:master May 16, 2024
2 checks passed
@davecgh davecgh deleted the server_peer_state_mtx branch May 16, 2024 15:20
Development

Successfully merging this pull request may close these issues.

loss of sync peer candidates, max peers reached
3 participants