STS Client doesn't refresh IAM Permissions #2332
Comments
Are you sure your sts provider isn't being wrapped in a cache? All credentials providers loaded from the "chain" (so I'm not seeing this behavior with an explicitly instantiated assumerole provider, though. If I create a client like so:
s3c := s3.NewFromConfig(cfg, func(o *s3.Options) {
	// the root credentials that power AssumeRole calls are static from config
	o.Credentials = stscreds.NewAssumeRoleProvider(sts.NewFromConfig(cfg), roleARN)
})
put that client in a The difference between cache and otherwise also explains why the CLI works as you expect, your two invocations there are entirely separate processes, there's no cache to speak of (they don't persist to disk with things like that to my knowledge). |
Hey @lucix-aws , thanks for the quick and thorough response! I'm going to take a deep dive into this to see what I can do. In the meantime - the place where I'm seeing this just so happens to be in a public repo that I can share for additional context. The call to The place in the same file where I've written my custom retry: https://github.com/AlexVulaj/backplane-cli/blob/backplane-assume-implement-retry/pkg/awsutil/sts.go#L123 So I think your theory about my provider being wrapped in a cache is true. What I've written now "works", however it would be nice if I could rewrite it to not recreate the sts client each time. |
Is there a recommended way to flush the cache or force it to refresh? I tried the following and unfortunately had no luck either:
credsProvider := credentials.NewStaticCredentialsProvider(creds.AccessKeyID, creds.SecretAccessKey, creds.SessionToken)
credsProvider.Value.CanExpire = true
credsProvider.Value.Expires = time.Now().Add(5 * time.Second)
...
config.WithCredentialsProvider(credsProvider) |
Those are the static credentials that are used to make the assume role call though if I understand correctly. The credentials that need to not be cached are the ones retrieved from |
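To make the caching behavior discussed above concrete, here is a minimal sketch that uses only the standard library. It is a simplified model and not the SDK's implementation: `Credentials`, `Provider`, `countingProvider`, and `CachingProvider` are hypothetical names invented for this illustration. To my knowledge the SDK's `aws.CredentialsCache` offers a comparable `Invalidate` method, but that should be checked against the SDK documentation before relying on it.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Credentials is a hypothetical stand-in for a set of temporary credentials.
type Credentials struct {
	Token   string
	Expires time.Time
}

// Provider is a hypothetical interface, loosely modeled on a credentials provider.
type Provider interface {
	Retrieve() (Credentials, error)
}

// countingProvider simulates an AssumeRole call: every Retrieve mints a new token.
type countingProvider struct {
	calls int
}

func (p *countingProvider) Retrieve() (Credentials, error) {
	p.calls++
	return Credentials{
		Token:   fmt.Sprintf("token-%d", p.calls),
		Expires: time.Now().Add(time.Hour),
	}, nil
}

// CachingProvider keeps returning the cached credentials until they expire
// or until Invalidate is called.
type CachingProvider struct {
	mu     sync.Mutex
	source Provider
	cached *Credentials
}

func NewCachingProvider(source Provider) *CachingProvider {
	return &CachingProvider{source: source}
}

func (c *CachingProvider) Retrieve() (Credentials, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.cached != nil && time.Now().Before(c.cached.Expires) {
		return *c.cached, nil // cache hit: the source is not consulted
	}
	creds, err := c.source.Retrieve()
	if err != nil {
		return Credentials{}, err
	}
	c.cached = &creds
	return creds, nil
}

// Invalidate drops the cached value so the next Retrieve calls the source again.
func (c *CachingProvider) Invalidate() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cached = nil
}

func main() {
	src := &countingProvider{}
	cache := NewCachingProvider(src)

	first, _ := cache.Retrieve()
	second, _ := cache.Retrieve() // served from the cache
	cache.Invalidate()
	third, _ := cache.Retrieve() // fetched from the source again

	fmt.Println(first.Token, second.Token, third.Token)
}
```

In this model, retrying a request never triggers a new fetch while the cached entry is still valid; only expiry or an explicit invalidation does.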
Thanks for your continued help on this - it's really appreciated. What you're saying makes sense. And yes, that's correct - the credentials retrieved from The flow here is a bit complicated, but I'll do my best to shed some more light here... I have 3 roles: 1, 2 and 3, that assume into each other in order. Each role is in a separate AWS account. Role 1 and Role 2 have static policies that allow Role 1 to assume Role 2. Role 2's permissions are dynamically updated by an external service to be able to assume Role 3. Role 3 has a trust relationship that always allows Role 2 to assume it. My flow looks like:
I believe the problem I'm running into is that at the time of step 5, the update from step 1 hasn't yet propagated. Because of that and the caching you mentioned earlier, the client created in step 5 will always fail its next call no matter how long I retry for. I've gotten around that by repeating both steps 5 and 6 until my call works, but at this point I'm wondering if there's a better way than recreating the client repeatedly. |
Hi @AlexVulaj, This is quite a convoluted setup, but I have an idea of what is happening here. When you say:
I assume this external service operates in an asynchronous fashion while the rest of that Go code you have there is synchronous. The reason why you are not running into this in the CLI is that you take a manual step here that is synchronous:
This will also explain why sleeping the main thread would ensure enough time has passed for the async plugin to complete before the client is initialized. It will also explain why retrying doesn't work: the SDK caches the credentials before the role policy is updated. I'm not sure why you are updating the role policy dynamically, but from a security standpoint this raises some red flags for me. Thanks, |
Thanks for the response, @RanVaknin. The code I'm working on is for a CLI. It makes a call to a backend API that updates the IAM permissions, and then returns to the CLI. To summarize the context of the product and our use case: this relates to how we (Red Hat) will manage access to AWS accounts for managed ROSA customers. We go through this process to make sure that permission is only given at the time at which it's requested, and only to the individual requesting it. I'll note that this was a requirement from AWS to manage access this way. That all said, it sounds like the solution I'm going with currently fits our needs best. I appreciate both of your time! |
Closing for now since we know there's no behavioral issue SDK-side. |
Describe the bug
I have an sts.Client configured with credentials for a given IAM Role. If the IAM permissions of the Role are updated elsewhere (e.g. via the web console) the sts.Client never seems to pick up those changes no matter how long I retry. I can recreate my sts.Client in the event of this error and manually retry, but I'd expect the client to handle this on its own.
Expected Behavior
I would expect the IAM Permission changes to affect my Go client after a few seconds. To demonstrate this using the AWS CLI, I did the following:
aws ec2 describe-instances, this gives me an UnauthorizedOperation error as I expected.
aws ec2 describe-instances again. This fails at first but after a few seconds successfully returns results.
I would expect this same behavior of the sts.Client in the Go SDK.
Current Behavior
I can retry for multiple minutes, but my sts.Client never seems to have the new permissions. I know the permissions have propagated because if I add a short time.Sleep before creating my client, everything works fine.
Reproduction Steps
It's hard to provide an exact code snippet because this seems to be due to a race condition, but doing my best here:
I've also tried configuring a retryer for the client that explicitly lists 403 HTTP responses as a retryable error:
No matter how long of a backoff I add or how many attempts, the request continues to fail.
Possible Solution
The standard retry mechanism in sts.Client should be able to detect this error automatically and refresh the IAM permissions.
Additional Information/Context
My specific use case is making AssumeRole calls quickly in sequence that span multiple AWS accounts. An external service updates the trust policies/IAM Permissions for this chain of AssumeRole calls to be executed. There seems to be a race condition where I have to wait for the IAM changes to propagate before I can even create my sts.Client. I'm not sure how I can accurately tell when those changes are recognized by AWS, so I'm left to retry by creating my sts Client over and over.
AWS Go SDK V2 Module Versions Used
Compiler and Version used
go version go1.20.4 darwin/arm64
Operating System and version
macOS Ventura 13.6