Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relay timeouts #918

Merged
merged 43 commits into from
Nov 25, 2024
Merged

Relay timeouts #918

merged 43 commits into from
Nov 25, 2024

Conversation

cody-littley
Copy link
Contributor

@cody-littley cody-littley commented Nov 20, 2024

Why are these changes needed?

Adds proper timeouts to relay operations.

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
@cody-littley cody-littley self-assigned this Nov 20, 2024
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
@cody-littley cody-littley marked this pull request as ready for review November 21, 2024 15:24
@cody-littley cody-littley mentioned this pull request Nov 21, 2024
5 tasks
relay/auth/authenticator.go Show resolved Hide resolved
relay/auth/authenticator_test.go Outdated Show resolved Hide resolved
relay/blob_provider.go Show resolved Hide resolved
relay/cache/cached_accessor.go Outdated Show resolved Hide resolved
relay/chunk_provider.go Show resolved Hide resolved
Signed-off-by: Cody Littley <[email protected]>
"sync"
)

// CachedAccessor is an interface for accessing a resource that is cached. It assumes that cache misses
// are expensive, and prevents multiple concurrent cache misses for the same key.
type CachedAccessor[K comparable, V any] interface {
// Get returns the value for the given key. If the value is not in the cache, it will be fetched using the Accessor.
Get(key K) (V, error)
// If the context is cancelled, the function may abort early. If multiple goroutines request the same key,
// cancellation of one request will not affect the others.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the cancellation of the first request can affect others (all of them get error)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I implemented this class, cancellation of the first request should not cause there to be an error for the following requests. Do you see a bug in the implementation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think it's about how to read "not affect", the failure of fetch is not the failure of first request, then this comment is correct. Anyway, it's fine to keep this.

}
result.wg.Add(1)
result.sem.TryAcquire(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This always succeeds, should it just Acquire()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always succeed either way, since this happens right after the semaphore is created and since no other goroutine has access to the semaphore yet. The reason why I use TryAcquire() is that it doesn't return an error. If I use Acquire() then I have to add error handling, even though it's impossible for an error to be returned.

With this in mind, are you ok with TryAcquire(1) here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does _ = result.sem.Acquire() work (with comment that this should always succeed)?

TryAquire() returns a boolean, so it's the same in the sense that the return value is ignored.

So both will work, the difference I think is that TryAcquire reads like it doesn't care about whether it has grabbed the resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've switched this to _ = result.sem.Acquire(context.Background(), 1)

@cody-littley cody-littley merged commit ac7ffdd into Layr-Labs:master Nov 25, 2024
6 checks passed
@cody-littley cody-littley deleted the relay-timeouts branch November 25, 2024 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants