-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow additional arguments to be passed to fsspec via FilesystemConfiguration #814
Comments
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
@rudolfix Do we have a specific cert we want to use in our test suite? Does CI have some certificate installed I can use? |
@Pipboyguy yeah I can create another account on AWS that requires certificate to connect. but maybe we could try this: generate a mock certificate, use it to connect and make sure that we get relevant SSL error which means that certificate is present WDYT? |
With a self signed cert I guess we'd need a MinIO deployment? If we go with S3 we'd need CloudFront to handle the TLS connection.
|
I'll start with a local MinIO container until further guidance. It's testing it in CI that I'm scratching my head over |
Following this issue as I think it will help with implementing git/github fsspec dlt-hub/verified-sources#301 . The initial case there is to pass
|
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
This default was passed to the filesystem factory each time:
I kept it for now, but I wonder whether this shouldn't be an additional argument passed by the user? Should it remain a default configuration? |
I looked at the repos at https://github.com/fsspec . It looks like many fsspec implementations - including s3fs, gdrivefs, ftp - do cache the listings (in a DirCache) and they accept fsspec's default of I'm pretty new to dlt so I don't know why it is hard-coded to False. My vote would be to leave the dlt default as False, but give user option to set to True,
|
@Pipboyguy Is it in your scope to support the additional arguments in the |
@deanja the reason to get rid of the cache is that most of I also say explicitly that I do not want the cache when doing |
I agree. Caches are mostly frightening!
|
@deanja One could argue its in this scope. This would be trivial: def fsspec_filesystem(
protocol: str,
credentials: FileSystemCredentials = None,
kwargs: Optional[DictStrAny] = None,
client_kwargs: Optional[DictStrAny] = None,
) -> Tuple[AbstractFileSystem, str]:
return fsspec_from_config(
FilesystemConfiguration(
protocol, credentials, kwargs=kwargs, client_kwargs=client_kwargs
)
) |
Signed-off-by: Marcel Coetzee <[email protected]>
) Refactored common storage configuration and revised imports in line with PEP rules. Also updated the string formatting in filesystem related tests and moved `test_s3_wrong_certificate` method to `test_filesystem_client.py`. There’s an addition of `kwargs` and `client_kwargs` parameters to `fsspec_filesystem` function allowing for a more configurable function. Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Updated the test case for S3 SSL certificate verification to use a self-signed certificate generated during the test. This change ensures the test is more reliable, as it doesn't depend on external, possibly changing, data. Improved the test setup by creating and using a temporary certificate file, which gets cleaned up after use. Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Updated the filesystem.md documentation to improve clarity, fix grammar mistakes, and ensure better readability. The changes include breaking up longer sentences, correcting spellings and punctuations. (#814) Signed-off-by: Marcel Coetzee <[email protected]>
…rguments (#814) Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
This commit updates the handling of filesystem and credential configuration for different cloud storage protocols. Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Signed-off-by: Marcel Coetzee <[email protected]>
Background
Our users should be able to pass additional arguments to fsspec filesystems when they are created. A good use case is to enable custom SSL checks on s3:
fsspec/s3fs#342
requires additional argument
use_ssl
and additional element inclient_kwargs
(`verify')Tasks
FilesystemConfiguration
so it accepts additional arguments and client arguments ie. via config fields accepting dictionariesfsspec_from_config
properlyfilesystem
to accept those additional arguments nicely, write docstrings etc.devel
branchThe text was updated successfully, but these errors were encountered: