-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build(datasets): upgrade s3fs to newer calver #348
build(datasets): upgrade s3fs to newer calver #348
Conversation
Signed-off-by: Matthias Roels <[email protected]>
Since s3fs is switched to aiobotocore, tests started to fail with message `'MockRawResponse' object has no attribute 'raw_headers'`. This is a bug in aiobotocore. Using moto server does seem to resolve the issues Signed-off-by: Matthias Roels <[email protected]>
Signed-off-by: Matthias Roels <[email protected]>
It's interesting that you need moto-server to resolve the issue. Did you get some weird authentication error? This is something I encountered in the past, with newer moto version the issue seems to resolve itself automatically. Does it fails on all version on specific Python Version? |
@noklam: No I didn't encounter any weird authn errors. If you don't use moto in server mode, you always run into these kind of errors:
For some reason, it occurs only in some test files. The tests for As a side note: I noticed there is a lot of inconsistency in the types of tests between different dataset classes. Is that intentional? |
@MatthiasRoels that's fine, I have research this in the past I think server mode is the right way. |
Signed-off-by: Matthias Roels <[email protected]>
Signed-off-by: Matthias Roels <[email protected]>
Signed-off-by: Matthias Roels <[email protected]>
Key=S3_KEY_PATH + "/" + video_name, | ||
Filename=str(tmp_file), | ||
) | ||
with open(tmp_file, "wb") as file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we could do tmp_file.open("wb")
since it's already a pathlib object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't knew that was possible :-). Do you want me to change it?
def credentials(): | ||
return { | ||
"key": "fake_access_key", | ||
"secret": "fake_secret_key", | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why there are 2 sets of credentials?
Do we need key
and secret
or the AWS_XXXXX
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One is for credentials file, the other one are env vars. The env vars are required for bucket setup etc while the credentials are used in the Dataset classes.
Signed-off-by: Matthias Roels <[email protected]>
Seems like the unit tests failed on (windows-latest, 3.8). Unfortunately, I have no access to a Windows machine for debugging and testing. And to be honest, I have never written software on a Windows OS before... So it would be nice if someone can help here. Thanks! |
Signed-off-by: Matthias Roels <[email protected]> Co-authored-by: Nok Lam Chan <[email protected]>
@MatthiasRoels I've tried running this on Gitpod (ubuntu) and the tests also fail there, so it's not just a windows problem. You can also see that in the builds: https://github.com/kedro-org/kedro-plugins/actions/runs/6557080300/job/17809502926?pr=348 I guess it was just the windows 3.8 that failed first. |
Update branch to see if the problem persist |
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
…k tests Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
…com:MatthiasRoels/kedro-plugins into build/datasets-upgrade-s3fs-to-newer-calver
Signed-off-by: Deepyaman Datta <[email protected]>
750f31a
to
402439e
Compare
Thanks @merelcht for taking over, I wanted to work on this for a while, but I’m in the middle of a cloud migration so I’m way to busy 🙁 |
Hi @MatthiasRoels no worries. I've been trying to fix the issues described by people in aio-libs/aiobotocore#755. It's not straightforward though, so I'm using another branch to experiment and not pollute this PR too much. |
@MatthiasRoels I opened #463 where I've got most tests passing now and marked the two failing ones with |
Thanks again for opening this PR @MatthiasRoels #463 is now merged, so I'll close this PR. After our |
Description
s3fs version is already 3+ years old. The aim of this PR is to upgrade it to their new calver versioning approach.
Fixes #337
Development notes
When s3fs switched to calver versioning, they switched from using
botocore
toaiobotocore
, which is an async client for AWS services. As a result it became much more difficult to usemoto
for unit testing as many people faced issues like the one described here aio-libs/aiobotocore#755. To fix it, we had to switch to using moto in server mode.Checklist
RELEASE.md
file