Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure a 120 second 'provider timeout' with each bmclib client #167

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

splaspood
Copy link

Description

It has recently been discovered that bmclib's handling of timeouts, combined with the multiple providers leads to a number of failed actions initiated by pbnj. This change increases the timeout to 120 seconds which is clearly far longer than a timeout needs to be in this instance.

Why is this needed

While troubleshooting this issue it was discovered that bmclib can take 20-30 seconds performing actions via redfish. This timeout increase is meant as a bandaid fix while we investigate the performance issue(s) in bmclib.

@splaspood splaspood force-pushed the fs-1816-timeouts-like-woah-0 branch from a3e34d1 to f58bee2 Compare October 10, 2024 16:01
Copy link
Member

@nshalman nshalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine. I didn't check if you had missed any other locations that might also need the increased timeout, though.

@@ -381,6 +385,9 @@ func (m Action) DeactivateSOL(ctx context.Context) error {
span.SetAttributes(attribute.String("bmc.host", host), attribute.String("bmc.username", user))
m.SendStatusMessage("working on SOL session deactivation")

ctx, cancel := context.WithTimeout(ctx, 120*time.Second)
defer cancel()

opts := []bmclib.Option{
bmclib.WithLogger(m.Log),
bmclib.WithPerProviderTimeout(common.BMCTimeoutFromCtx(ctx)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm - there is currently an option being passed in which sets a timeout for each of the providers, is this not working?
bmclib.WithPerProviderTimeout(common.BMCTimeoutFromCtx(ctx))

With the current change - if a provider takes too long on the connection it leaves no time for the others.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants