Access to public S3 buckets without credentials #287

Merged · 10 commits · Nov 10, 2023
Changes from 8 commits
1 change: 1 addition & 0 deletions doc/changes/changelog.md

Generated file; diff not rendered by default.

11 changes: 11 additions & 0 deletions doc/changes/changes_2.7.8.md
@@ -0,0 +1,11 @@
# Cloud Storage Extension 2.7.8, released 2023-11-10

Code name: Access to public S3 buckets without credentials

## Summary

Implemented an option to access public S3 buckets without credentials.

## Features

* #283: Support publicly available S3 buckets without credentials
24 changes: 14 additions & 10 deletions doc/user_guide/user_guide.md
@@ -150,7 +150,7 @@ downloaded jar file is the same as the checksum provided in the releases.
To check the SHA256 result of the local jar, run the command:

```sh
sha256sum exasol-cloud-storage-extension-2.7.7.jar
sha256sum exasol-cloud-storage-extension-2.7.8.jar
```

### Building From Source
@@ -180,7 +180,7 @@ mvn clean package -DskipTests=true
```

The assembled jar file should be located at
`target/exasol-cloud-storage-extension-2.7.7.jar`.
`target/exasol-cloud-storage-extension-2.7.8.jar`.

### Create an Exasol Bucket

@@ -202,7 +202,7 @@ for the HTTP protocol.
Upload the jar file using curl command:

```sh
curl -X PUT -T exasol-cloud-storage-extension-2.7.7.jar \
curl -X PUT -T exasol-cloud-storage-extension-2.7.8.jar \
http://w:<WRITE_PASSWORD>@exasol.datanode.domain.com:2580/<BUCKET>/
```

@@ -234,7 +234,7 @@ OPEN SCHEMA CLOUD_STORAGE_EXTENSION;

CREATE OR REPLACE JAVA SET SCRIPT IMPORT_PATH(...) EMITS (...) AS
%scriptclass com.exasol.cloudetl.scriptclasses.FilesImportQueryGenerator;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
@@ -244,12 +244,12 @@ CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
end_index DECIMAL(36, 0)
) AS
%scriptclass com.exasol.cloudetl.scriptclasses.FilesMetadataReader;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SET SCRIPT IMPORT_FILES(...) EMITS (...) AS
%scriptclass com.exasol.cloudetl.scriptclasses.FilesDataImporter;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/
```

@@ -268,12 +268,12 @@ OPEN SCHEMA CLOUD_STORAGE_EXTENSION;

CREATE OR REPLACE JAVA SET SCRIPT EXPORT_PATH(...) EMITS (...) AS
%scriptclass com.exasol.cloudetl.scriptclasses.TableExportQueryGenerator;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SET SCRIPT EXPORT_TABLE(...) EMITS (ROWS_AFFECTED INT) AS
%scriptclass com.exasol.cloudetl.scriptclasses.TableDataExporter;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/
```

@@ -407,13 +407,13 @@ CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
) AS
%jvmoption -DHTTPS_PROXY=http://username:<PASSWORD>@<PROXY_HOST>:1180
%scriptclass com.exasol.cloudetl.scriptclasses.FilesMetadataReader;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SET SCRIPT IMPORT_FILES(...) EMITS (...) AS
%jvmoption -DHTTPS_PROXY=http://username:<PASSWORD>@<PROXY_HOST>:1180
%scriptclass com.exasol.cloudetl.scriptclasses.FilesDataImporter;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/
```

@@ -722,6 +722,10 @@ S3_SESSION_TOKEN
Please follow the [Amazon credentials management best practices][aws-creds] when
creating credentials.

If you are accessing a public bucket, you do not need credentials. In that case,
set `S3_ACCESS_KEY` and `S3_SECRET_KEY` to empty values:
`S3_ACCESS_KEY=;S3_SECRET_KEY=`.

[aws-creds]: https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html

Another required parameter is the S3 endpoint, `S3_ENDPOINT`. An endpoint is the
2 changes: 1 addition & 1 deletion pk_generated_parent.pom

Generated file; diff not rendered by default.

4 changes: 2 additions & 2 deletions pom.xml
@@ -3,14 +3,14 @@
<modelVersion>4.0.0</modelVersion>
<groupId>com.exasol</groupId>
<artifactId>cloud-storage-extension</artifactId>
<version>2.7.7</version>
<version>2.7.8</version>
<name>Cloud Storage Extension</name>
<description>Exasol Cloud Storage Import And Export Extension</description>
<url>https://github.com/exasol/cloud-storage-extension/</url>
<parent>
<artifactId>cloud-storage-extension-generated-parent</artifactId>
<groupId>com.exasol</groupId>
<version>2.7.7</version>
<version>2.7.8</version>
<relativePath>pk_generated_parent.pom</relativePath>
</parent>
<properties>
22 changes: 16 additions & 6 deletions src/main/scala/com/exasol/cloudetl/bucket/S3Bucket.scala
@@ -52,6 +52,9 @@ final case class S3Bucket(path: String, params: StorageProperties) extends Bucke
)
}

/** Returns true when both the S3 access key and the secret key are empty, i.e. anonymous access to a public bucket is requested. */
private[this] def isAnonymousAWSParams(properties: StorageProperties): Boolean =
properties.getString(S3_ACCESS_KEY).isEmpty && properties.getString(S3_SECRET_KEY).isEmpty

/**
* @inheritdoc
*
@@ -83,15 +86,22 @@ final case class S3Bucket(path: String, params: StorageProperties) extends Bucke
properties
}

conf.set("fs.s3a.access.key", mergedProperties.getString(S3_ACCESS_KEY))
conf.set("fs.s3a.secret.key", mergedProperties.getString(S3_SECRET_KEY))

if (mergedProperties.containsKey(S3_SESSION_TOKEN)) {
if (isAnonymousAWSParams(mergedProperties)) {
conf.set(
"fs.s3a.aws.credentials.provider",
classOf[org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider].getName()
classOf[org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider].getName()
)
conf.set("fs.s3a.session.token", mergedProperties.getString(S3_SESSION_TOKEN))
} else {
conf.set("fs.s3a.access.key", mergedProperties.getString(S3_ACCESS_KEY))
conf.set("fs.s3a.secret.key", mergedProperties.getString(S3_SECRET_KEY))

if (mergedProperties.containsKey(S3_SESSION_TOKEN)) {
conf.set(
"fs.s3a.aws.credentials.provider",
classOf[org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider].getName()
)
conf.set("fs.s3a.session.token", mergedProperties.getString(S3_SESSION_TOKEN))
}
}

properties.getProxyHost().foreach { proxyHost =>
9 changes: 9 additions & 0 deletions src/test/scala/com/exasol/cloudetl/bucket/S3BucketTest.scala
@@ -62,6 +62,15 @@ class S3BucketTest extends AbstractBucketTest {
assertConfigurationProperties(bucket, configMappings - "fs.s3a.session.token")
}

test("apply returns anonymous credentials provider for public access configuration") {
val exaMetadata = mockConnectionInfo("access", "S3_ACCESS_KEY=;S3_SECRET_KEY=")
val bucket = getBucket(defaultProperties, exaMetadata)
assert(
bucket.getConfiguration().get("fs.s3a.aws.credentials.provider") ==
"org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
)
}

test("apply returns S3Bucket with secret and session token from connection") {
val exaMetadata = mockConnectionInfo("access", "S3_SECRET_KEY=secret;S3_SESSION_TOKEN=token")
val bucket = getBucket(defaultProperties, exaMetadata)