Possible corrupted table cache info on application start #3520
Comments
Unfortunately, we got the same error on another table. This application was ported from ECS to EKS last week and runs in a different K8s namespace. Versions:
Hello @Sussumu, Thank you for reporting the issue. I tried to reproduce it on my end using the code snippet you provided but was unable to do so. As you pointed out, the issue does not appear to be reproducible at will. I will review this further with the team to understand the root cause of the problem. Thanks again for bringing this to our attention. Regards,
@Sussumu Your first example looks like it is using LocalStack. Was your second incident also using LocalStack? You are correct by default the
@normj Oh, it was just an example. In production we are using a regular AWS account in both cases. We're still experiencing this issue, though it's not common: maybe less than once a week.
Describe the bug
We recently faced a bug in production where the application would not load any document, stating that the number of hash keys was different than one. This application has been running for a few months with no changes whatsoever so we thought this was some kind of unwanted infrastructure change. After a restart, everything came back to normal.
I didn't spend a lot of time investigating the AWS SDK code, but from what I could see, the code checks the number of hash keys declared by the application, which has to be exactly one. It gets this data from a previously cached value, which may come from a `DescribeTable` call or from the code itself depending on the value of `DisableFetchingTableMetadata`. Our code didn't explicitly set this attribute, so it may have come from a `DescribeTable` call. Please correct me if I'm wrong.
Is it possible that this call returned corrupt data?
Regression Issue
Expected Behavior
The application was supposed to query a document by its partition key and sort key, as it had been doing for a few months.
Current Behavior
We inject an `IDynamoDBContext` and load the document like this:
Since the restart we haven't faced any more errors like this.
Reproduction Steps
I've just copied the most important parts. There's nothing special about this configuration, and we basically copy/paste it to other projects with no problem. I can't reproduce it now. Maybe if some background call like the `DescribeTable` that I've mentioned is altered, we can get the same error.
Possible Solution
As I said, I think it's related to the underlying `DescribeTable` call. I assume that setting `DisableFetchingTableMetadata` to true and manually specifying the keys in code may correct this, since it's one less moving part.
Additional Information/Context
The bug started after a Kubernetes pod restart that followed a node change. All other pods restarted too, including other ones that query DynamoDB on the same account, but only this one hit the bug.
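For reference, a minimal sketch of the workaround mentioned under Possible Solution: declare the keys in code and tell the context not to fetch table metadata. The class name `ExampleDocument`, the table name, the attribute names `pk`/`sk`/`payload`, and the `ContextFactory` helper are all hypothetical placeholders, not taken from the affected application, and whether this actually avoids the error has not been verified.

```csharp
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;

// Hypothetical model: table and attribute names are placeholders.
// With table metadata fetching disabled, the key schema comes from
// these attributes instead of a DescribeTable call.
[DynamoDBTable("ExampleTable")]
public class ExampleDocument
{
    [DynamoDBHashKey("pk")]
    public string PartitionKey { get; set; } = string.Empty;

    [DynamoDBRangeKey("sk")]
    public string SortKey { get; set; } = string.Empty;

    [DynamoDBProperty("payload")]
    public string Payload { get; set; } = string.Empty;
}

// Hypothetical helper showing where the flag is set.
public static class ContextFactory
{
    public static IDynamoDBContext Create(IAmazonDynamoDB client)
    {
        var config = new DynamoDBContextConfig
        {
            // Build the table description from the attributes above
            // rather than from a DescribeTable response.
            DisableFetchingTableMetadata = true
        };
        return new DynamoDBContext(client, config);
    }
}
```

With this setup the key definitions live in one place, so any secondary index used in queries would also need to be declared through the corresponding attributes.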
AWS .NET SDK and/or Package version used
AWSSDK.DynamoDBv2 Version="3.7.300.12"
AWSSDK.Extensions.NETCore.Setup Version="3.7.300"
AWSSDK.SecretsManager Version="3.7.301.11"
AWSSDK.SecurityToken Version="3.7.300.22"
Targeted .NET Platform
.NET 7.0
Operating System and version
Custom Alpine x64 image