
DEVPROD-13056: fix s3 put skip_existing failures #8510

Merged 1 commit into evergreen-ci:main on Nov 26, 2024

Conversation

drichmdb (Contributor)

DEVPROD-13056

Description

This commit fixes s3.put's skip_existing parameter so the command no longer fails whenever a remote file already exists. At the end of s3.put, we check that the length of uploadedFiles matches the length of fileList; if the lengths aren't equal, the command fails because at least one upload errored. skip_existing did not append to this list, so skipped files, which are expected, were being counted as upload errors.

To fix this, I added a new slice that tracks files we've skipped separately. The end check now considers both uploaded and skipped files to determine if there was an error.

I've also included a new integration test that checks s3.put with skip_existing set to true.

@drichmdb drichmdb force-pushed the DEVPROD-13056 branch 3 times, most recently from 441e21a to 40ed804 on November 25, 2024 at 19:18
@drichmdb drichmdb requested a review from ablack12 November 25, 2024 20:52
ablack12 (Contributor) left a comment:

LGTM with a couple of nits (no need for me to look again unless you don't want to address, then feel free to DM so we can sync on it)

@@ -422,6 +424,12 @@ retryLoop:
return errors.Wrapf(err, "checking if file '%s' exists", remoteName)
}
if exists {
if s3pc.preservePath {
ablack12 (Contributor):

nit -- since we aren't using this value anyway, can we just keep a skippedFilesCount instead? Then we don't need an if/else here. (Seems like we might be able to do that real quick for uploadedFiles too if you want to make them consistent, but I think it's fine either way)

drichmdb (Contributor, Author):

We can't make that change for uploadedFiles because they are used by the attachFiles function, but I could for skippedFiles quickly!

remoteFile := fmt.Sprintf("tests/%s/%s", t.Name(), id)

cmd := s3PutFactory()
params := map[string]any{
ablack12 (Contributor):

🤯 didn't know about any

drichmdb (Contributor, Author):

It's newish!

content, err := io.ReadAll(got)
require.NoError(t, err)

assert.Equal(t, payload, content)
ablack12 (Contributor):

Can you add a little comment here, like "verify that file content wasn't overwritten by the second file"

drichmdb (Contributor, Author):

Done!

@drichmdb drichmdb force-pushed the DEVPROD-13056 branch 2 times, most recently from df22233 to ff5cff3 on November 25, 2024 at 21:19
Comment on lines +77 to +78
aws_key: $AWS_ACCESS_KEY_ID
aws_secret: $AWS_SECRET_ACCESS_KEY
Member:

Are these meant to be different than the ones used on line 54 and 55? I don't have time for a full review so I may be missing something.

drichmdb (Contributor, Author):

I'm not sure where those are coming from honestly -- I'm using the credentials we get from assume role at the beginning of the task!


if processedCount != len(filesList) && !s3pc.skipMissing {
logger.Task().Infof("Attempted to upload %d files, %d successfully uploaded.", len(filesList), processedCount)
return errors.Errorf("uploaded %d files of %d requested", processedCount, len(filesList))
ablack12 (Contributor):

I think this edit is wrong -- it should still be using the length of uploadedFiles.
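As I read the reviewer's point, the success condition can use the combined processed count, but the log and error messages should keep reporting how many files were actually uploaded. A hedged sketch of that distinction, with illustrative names rather than the real Evergreen identifiers:

```go
package main

import "fmt"

// checkUploads mirrors the check in the diff above: skipped files count
// toward completion, but error reporting uses the uploaded count only.
func checkUploads(uploaded, skipped, requested int, skipMissing bool) error {
	processedCount := uploaded + skipped
	if processedCount != requested && !skipMissing {
		// Report uploads, not processed files, so the message reflects
		// what actually reached S3.
		return fmt.Errorf("uploaded %d files of %d requested", uploaded, requested)
	}
	return nil
}

func main() {
	// One file skipped via skip_existing: not an error.
	fmt.Println(checkUploads(1, 1, 2, false))
	// One file genuinely failed: the error cites the uploaded count.
	fmt.Println(checkUploads(1, 0, 2, false))
}
```

The first call succeeds because the skipped file is expected; the second fails because a requested file was neither uploaded nor skipped.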

@drichmdb drichmdb merged commit ce17a00 into evergreen-ci:main Nov 26, 2024
10 checks passed