At scales of 1000+ peers on a single GoBGP server, we've noticed that the rate at which additional peers can be brought up slows in proportion to both the number of peers and the number of routes each peer is advertising to the server.
As an example case, with 2500 iBGP peers each advertising 1 route and the server configured as a route reflector, it takes nearly 5 minutes for all peers to reach the established state. If the number of routes is increased, this time grows proportionally. During this bring-up, GoBGP API performance also suffers: it can take tens of seconds to run even a simple `gobgp neighbor` command.
From what we can see, the bottleneck is that all events are funneled through a single goroutine here: https://github.com/osrg/gobgp/blob/master/pkg/server/server.go#L487-L505 - Every new route received and every API request passes through it, and every time a peer wants to transition state, its goroutine exits and waits for this central goroutine to handle the event and start a new peer goroutine.
Whilst routes and API requests aren't necessarily a problem if they are slow to be processed, the peer state transitions handled via this routine are. Under the load described above, the routine develops a significant backlog, and with smaller hold timers it can take so long to process a change from, say, OPENSENT to ESTABLISHED that the other end expires its hold timer.
There also seems to be a not-insignificant amount of load caused by the use of reflection on each iteration to process these events: we see upwards of 15-20% of CPU time spent here alone under this type of load. Replacing this with a single shared channel to the server (i.e. all peers sharing the same peer -> server channel) provided a noticeable improvement, but of course at the cost of no longer randomising fairly across all peers.
It seems possible to let the peer FSM advance from state to state without requiring it to exit and be restarted by the server each time, eliminating the bottleneck entirely. Some rework would likely be needed to ensure all the functionality in `handleFSMMessage` still works when the peer runs independently, but as far as we can see it should not be a breaking API change.
Would you be open to a PR that implements this?
I'm not sure whether Go is the right choice if you need to handle that many peers. As you said, there is some room for improvement in GoBGP at the cost of more complexity, but I'm not sure GoBGP can compete with C or Rust BGP implementations.