athena-driver #3014

egor-ryashin · 2023-09-01T13:22:14Z

Checklist

Manual verification
Unit test coverage
E2E test coverage
Needs manual QA?

Summary

Issue addressed:

#3014

Details:

Adds Athena data source for data ingestion, see screenshots.

Steps to Verify

begelundmuller · 2023-09-04T11:50:55Z

runtime/drivers/athena/athena.go

+		{
+			Key:         "output.location",
+			DisplayName: "Output location",
+			Description: "Output location for query results in S3.",
+			Placeholder: "bucket-name",
+			Type:        drivers.StringPropertyType,
+			Required:    true,
+		},


I would suggest output_location. We've used dots before, but we're moving away from them (a dot is supposed to be shorthand for nested fields only).

For the placeholder – it should be something like s3://bucket/path, right?

Nope, just a bucket name.

begelundmuller · 2023-09-04T11:52:03Z

runtime/drivers/athena/athena.go

+		{
+			Key:         "profile.name",
+			DisplayName: "AWS profile",
+			Description: "AWS profile for credentials.",
+			Type:        drivers.StringPropertyType,
+			Required:    true,
+		},


This can be optional, right? Also, we don't support AWS profiles in the S3 connector right now, so wondering if we should remove it (always use the default one), and contemplate it in a follow-up PR.

Also, should this be in ConfigProperties and not SourceProperties?

Also, should this be in ConfigProperties and not SourceProperties?

There can be sources with different profiles (meanwhile profiles can be for the same AWS account).

begelundmuller · 2023-09-04T11:53:38Z

runtime/drivers/athena/athena.go

+type configProperties struct {
+	// SecretJSON      string `mapstructure:"google_application_credentials"`
+	// AllowHostAccess bool   `mapstructure:"allow_host_access"`
+}


I think we should support passing AWS access tokens directly here – see the S3 driver

begelundmuller · 2023-09-04T12:10:08Z

runtime/drivers/athena/athena.go

+// DownloadFiles returns a file iterator over objects stored in gcs.
+// The credential json is read from config google_application_credentials.
+// Additionally in case `allow_host_credentials` is true it looks for "Application Default Credentials" as well
+func (c *Connection) DownloadFiles(ctx context.Context, source *drivers.BucketSource) (drivers.FileIterator, error) {


Incorrect docstring

begelundmuller · 2023-09-04T12:19:04Z

runtime/drivers/athena/athena.go

+	prefix := "parquet_output_" + uuid.New().String()
+	bucketName := strings.TrimPrefix(strings.TrimRight(conf.OutputLocation, "/"), "s3://")
+	unloadPath := bucketName + "/" + prefix
+	err = c.unload(ctx, conf, "s3://"+unloadPath)
+	if err != nil {
+		return nil, fmt.Errorf("failed to unload: %w", err)
+	}


Why can't we use the output location outright? The Athena code samples don't appear to do any rewriting of the output path: https://docs.aws.amazon.com/athena/latest/ug/code-samples.html

If we need to rewrite the output location, use url.Parse and associated functions to safely edit the URL

The Athena Go API and gocloud.dev/blob/s3blob demand different S3 location formats.
s3blob needs only the bucket name:

s3bucket, err := s3blob.OpenBucketV2(context.TODO(), s3client, "athena-output-2820", nil)

and throws the error "InvalidBucketName: The specified bucket is not valid." when s3://athena-output-2820 is passed.
The Athena SDK, in turn, complains "InvalidRequestException: Invalid location athena-output-20287" if the s3:// prefix is not specified:

executeParams := &athena.StartQueryExecutionInput{ QueryString: aws.String(fmt.Sprintf("UNLOAD (SELECT * FROM cat.ptable limit 10) TO '%s' WITH (format = 'PARQUET')", "athena-output-20287")), ResultConfiguration: resultConfig, }

So we need to decide what the user should pass: "s3://bucket-name" or "bucket-name", or whether the user may specify either and expect the application to figure it out.
Right now the implementation allows all of these approaches, but that requires additional parameter transformations.

begelundmuller · 2023-09-04T12:37:11Z

runtime/drivers/athena/athena.go

+	r := retrier.New(retrier.LimitedExponentialBackoff(20, 100*time.Millisecond, 1*time.Second), nil) // 100 200 400 800 1000 1000 1000 1000 1000 1000 ... < 20 sec
+
+	return r.Run(func() error {


Given the Athena latencies, I don't think a mixed exponential/linear strategy is needed. Can just keep it simple and do a loop with a sleep (will also make it easier to reason about how cancellation gets enforced). Or maybe a loop with a select that checks both the timer and ctx.Done() (to support faster cancellation)
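A minimal sketch of the loop suggested above, assuming a hypothetical `check` callback that stands in for the status call (for example `GetQueryExecution`). The names `pollUntilDone` and `demoPoll` are illustrative and not part of the PR. The `select` waits on both the ticker and `ctx.Done()`, so cancellation is observed without waiting out a full sleep.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollUntilDone calls check, waits one interval, and repeats until check
// reports done, check returns an error, or ctx is cancelled.
// check is a hypothetical stand-in for a status call such as GetQueryExecution.
func pollUntilDone(ctx context.Context, interval time.Duration, check func(context.Context) (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		done, err := check(ctx)
		if err != nil {
			return err // terminal error: stop polling instead of retrying
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err() // cancellation or deadline ends the wait immediately
		case <-ticker.C:
		}
	}
}

// demoPoll runs the loop against a fake check that finishes on the third call.
func demoPoll() (calls int, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err = pollUntilDone(ctx, 5*time.Millisecond, func(context.Context) (bool, error) {
		calls++
		return calls >= 3, nil
	})
	return calls, err
}

func main() {
	calls, err := demoPoll()
	fmt.Println("calls:", calls, "err:", err)
}
```

Because an error from `check` ends the loop, a failed query is reported once rather than retried.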

begelundmuller · 2023-09-04T12:38:46Z

runtime/drivers/athena/athena.go

+	}
+
+	// Get Query execution and check for the Query state constantly every 2 second
+	executionID := *athenaExecution.QueryExecutionId


Redundant dereference (using &executionID in the call to GetQueryExecutionInput)

begelundmuller · 2023-09-04T12:39:33Z

runtime/drivers/athena/athena.go

+		status, stateErr := client.GetQueryExecution(ctx, &athena.GetQueryExecutionInput{
+			QueryExecutionId: &executionID,
+		})
+
+		if stateErr != nil {
+			return stateErr
+		}


Can just use the normal var name err – no apparent conflict in scope

begelundmuller · 2023-09-07T09:30:09Z

runtime/drivers/athena/athena.go

+		{
+			Key:         "output_location",
+			DisplayName: "S3 output location",
+			Description: "Output location for query results in S3.",
+			Placeholder: "mybucket",
+			Type:        drivers.StringPropertyType,
+			Required:    true,
+		},


The Athena docs use an s3://bucket/path output location format, so we should support the same and use that as the placeholder

begelundmuller · 2023-09-07T09:30:39Z

runtime/drivers/athena/athena.go

+		{
+			Key:         "region",
+			DisplayName: "AWS region",
+			Description: "AWS profile for credentials.",
+			Type:        drivers.StringPropertyType,
+			Required:    true,
+		},


Description doesn't match

begelundmuller · 2023-09-07T15:50:30Z

runtime/drivers/athena/athena.go

+func (d driver) HasAnonymousSourceAccess(ctx context.Context, src drivers.Source, logger *zap.Logger) (bool, error) {
+	return false, fmt.Errorf("not implemented")
+}


Should return false, nil

begelundmuller · 2023-09-07T15:53:49Z

web-common/src/features/sources/modal/yupSchemas.ts

+          )
+          .required("Source name is required"),
+        output_location: yup.string().required(),
+        region: yup.string(),


It's marked required in the spec, but not here

begelundmuller · 2023-09-07T15:58:05Z

runtime/services/catalog/migrator/sources/sources.go

+	case "athena":
+		return &drivers.BucketSource{
+			Properties: props,
+		}, nil


See note about implementing as a DatabaseSource

begelundmuller · 2023-09-07T16:03:05Z

runtime/drivers/athena/athena.go

+	cfg, err := awsconfig.LoadDefaultConfig(
+		ctx,
+		awsconfig.WithRegion(conf.Region),
+		awsconfig.WithCredentialsProvider(credentials.NewStaticCredentialsProvider(c.config.AccessKeyID, c.config.SecretAccessKey, c.config.SessionToken)),
+	)


Does this work for environment credentials (in ~/.aws)? See the S3 connector – the expected behavior is: use access key if provided, else fallback to environment credentials unless AllowHostAccess is false.
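The expected behavior described above can be sketched as a small decision function. This is an illustration only: `configProperties`, `credentialSource`, and `chooseCredentialSource` are hypothetical names, and how each outcome maps to an AWS SDK credentials provider is left out.

```go
package main

import "fmt"

// configProperties mirrors the fields discussed in the review (hypothetical shape).
type configProperties struct {
	AccessKeyID     string
	SecretAccessKey string
	SessionToken    string
	AllowHostAccess bool
}

// credentialSource names where credentials should come from.
type credentialSource string

const (
	staticCredentials credentialSource = "static" // explicit access key from config
	hostCredentials   credentialSource = "host"   // environment, e.g. ~/.aws
	noCredentials     credentialSource = "none"   // nothing usable is configured
)

// chooseCredentialSource applies the rule: use the access key if provided,
// else fall back to host credentials unless AllowHostAccess is false.
func chooseCredentialSource(c configProperties) credentialSource {
	if c.AccessKeyID != "" || c.SecretAccessKey != "" {
		return staticCredentials
	}
	if c.AllowHostAccess {
		return hostCredentials
	}
	return noCredentials
}

func main() {
	fmt.Println(chooseCredentialSource(configProperties{AccessKeyID: "key", SecretAccessKey: "secret"}))
	fmt.Println(chooseCredentialSource(configProperties{AllowHostAccess: true}))
	fmt.Println(chooseCredentialSource(configProperties{}))
}
```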

begelundmuller · 2023-09-07T16:19:51Z

runtime/drivers/athena/athena.go

+	prefix := "parquet_output_" + uuid.New().String()
+	bucketName := strings.TrimPrefix(strings.TrimRight(conf.OutputLocation, "/"), "s3://")
+	unloadPath := bucketName + "/" + prefix
+	err = c.unload(ctx, cfg, conf, "s3://"+unloadPath)


See earlier note – it should take OutputLocation in the s3://bucket/path format, and then we can use url.Parse to parse it and obtain the bucket name for openBucket.

We should also make sure it supports nested output locations, as far as I can tell, it can't be assumed the bucket is dedicated to only Athena?
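A sketch of the `url.Parse` approach, assuming the user passes the location as `s3://bucket/path`. The function name `unloadTarget` and the folder name in the example are hypothetical. It returns the bucket name (the form s3blob expects), the object prefix inside the bucket, and the full `s3://` location (the form Athena expects), and it keeps nested output paths intact.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// unloadTarget splits an s3://bucket/path output location and appends folder to it.
func unloadTarget(outputLocation, folder string) (bucket, unloadPath, unloadLocation string, err error) {
	u, err := url.Parse(outputLocation)
	if err != nil {
		return "", "", "", err
	}
	if u.Scheme != "s3" || u.Host == "" {
		return "", "", "", fmt.Errorf("output location must look like s3://bucket/path, got %q", outputLocation)
	}
	// path.Join normalizes duplicate and trailing slashes.
	u.Path = path.Join("/", u.Path, folder)
	return u.Host, strings.TrimPrefix(u.Path, "/"), u.String(), nil
}

func main() {
	bucket, unloadPath, unloadLocation, err := unloadTarget("s3://bucket-name/prefix/", "rill-tmp-123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(bucket)         // bucket name for opening the bucket
	fmt.Println(unloadPath)     // object prefix inside the bucket
	fmt.Println(unloadLocation) // full location for the UNLOAD statement
}
```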

begelundmuller · 2023-09-07T16:24:56Z

runtime/drivers/athena/athena.go

+func (c *Connection) openBucket(ctx context.Context, conf *sourceProperties, bucket string) (*blob.Bucket, error) {
+	cfg, err := awsconfig.LoadDefaultConfig(
+		ctx,
+		awsconfig.WithRegion(conf.Region),
+		awsconfig.WithCredentialsProvider(credentials.NewStaticCredentialsProvider(c.config.AccessKeyID, c.config.SecretAccessKey, c.config.SessionToken)),
+	)
+	if err != nil {
+		return nil, err
+	}
+
+	s3client := s3v2.NewFromConfig(cfg)
+	return s3blob.OpenBucketV2(ctx, s3client, bucket, nil)
+}


This duplicates the awsconfig from DownloadFiles and may have the same credentials issues

begelundmuller · 2023-09-07T16:31:48Z

runtime/drivers/athena/athena.go

+	r := retrier.New(retrier.ConstantBackoff(20, 1*time.Second), nil)
+
+	return r.RunCtx(ctx, func(ctx context.Context) error {
+		status, err := client.GetQueryExecution(ctx, &athena.GetQueryExecutionInput{
+			QueryExecutionId: athenaExecution.QueryExecutionId,
+		})
+		if err != nil {
+			return err
+		}
+
+		state := status.QueryExecution.Status.State
+
+		if state == types.QueryExecutionStateSucceeded || state == types.QueryExecutionStateCancelled {
+			return nil
+		} else if state == types.QueryExecutionStateFailed {
+			return fmt.Errorf("Athena query execution failed %s", *status.QueryExecution.Status.AthenaError.ErrorMessage)
+		}
+		return fmt.Errorf("Execution is not completed yet, current state: %s", state)
+	})


This seems to translate to a 20 second query execution timeout? It's too little, it should probably continue to check until the ctx is cancelled. (It would also return a weird retry error message when hitting the 20s timeout.)

It seems it will continue to retry even if query execution failed?

I don't see why this warrants a third party library dependency instead of a for loop and simple select that checks a timer and context cancellation

Nope, see

func (c DefaultClassifier) Classify(err error) Action { if err == nil { return Succeed } return Retry }

We've already imported that library, and it provides an opinionated approach so one doesn't need to reinvent it. The names retrier and ConstantBackoff also make the intention recognizable right away. If I see a low-level Timer, I need to read more scrupulously to figure out whether it's just a simple retry pattern or something more complex.

begelundmuller · 2023-09-07T16:34:23Z

runtime/drivers/athena/athena.go

+	return err
+}
+
+func (c *Connection) DownloadFiles(ctx context.Context, source *drivers.BucketSource) (drivers.FileIterator, error) {


It seems it doesn't call cleanPath on success cases? It's probably easiest to wrap the file iterator with an object that calls cleanPath on close
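One way to sketch the wrapper idea, assuming a hypothetical minimal `fileIterator` interface in place of `drivers.FileIterator` (the real interface has more methods, which embedding would forward unchanged). `cleanupIterator` and `fakeIterator` are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
)

// fileIterator is a hypothetical, minimal stand-in for drivers.FileIterator.
type fileIterator interface {
	Close() error
}

// cleanupIterator wraps an iterator and runs cleanup when it is closed.
type cleanupIterator struct {
	fileIterator
	cleanup func() error
}

// Close closes the wrapped iterator first, then removes temporary output.
// Both errors are reported if both steps fail.
func (it *cleanupIterator) Close() error {
	return errors.Join(it.fileIterator.Close(), it.cleanup())
}

// fakeIterator records whether Close was called (for the demo only).
type fakeIterator struct{ closed bool }

func (f *fakeIterator) Close() error {
	f.closed = true
	return nil
}

// demoCleanupOnClose shows that closing the wrapper triggers both steps.
func demoCleanupOnClose() (closed, cleaned bool, err error) {
	inner := &fakeIterator{}
	it := &cleanupIterator{
		fileIterator: inner,
		cleanup:      func() error { cleaned = true; return nil },
	}
	err = it.Close()
	return inner.closed, cleaned, err
}

func main() {
	closed, cleaned, err := demoCleanupOnClose()
	fmt.Println(closed, cleaned, err)
}
```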

begelundmuller · 2023-09-07T16:40:09Z

Another comment – please also add docs changes for the Athena connector. And @nishantmonu51 requested to let people know that having an S3 file retention rule for the output_location is a good idea, in case data is orphaned there (e.g. if Rill crashes).

begelundmuller

Two other questions:

It's missing the ability to configure the AWS region now?
What region will be used if the workgroup is nil? Is it guaranteed to be the output_location one?

begelundmuller · 2023-09-14T15:25:24Z

runtime/drivers/athena/athena.go

+		{
+			Key:         "athena_output_location",
+			DisplayName: "S3 output location",
+			Description: "Output location for query results in S3.",
+			Placeholder: "s3://bucket-name/path/",
+			Type:        drivers.StringPropertyType,
+			Required:    true,
+		},
+		{
+			Key:         "athena_workgroup",
+			DisplayName: "AWS Athena workgroup",
+			Description: "AWS Athena workgroup to use for queries.",
+			Type:        drivers.StringPropertyType,
+			Required:    false,
+		},


Can omit the athena_ prefix for the keys, it's implied when the code file starts with type: athena

begelundmuller · 2023-09-14T15:32:34Z

runtime/services/catalog/migrator/sources/sources.go

+	case "athena":
+		vars["aws_access_key_id"] = env["aws_access_key_id"]
+		vars["aws_secret_access_key"] = env["aws_secret_access_key"]
+		vars["aws_session_token"] = env["aws_session_token"]


Also needed in runtime/connections.go#connectorConfig

begelundmuller · 2023-09-14T15:35:27Z

runtime/drivers/duckdb/transporter/utils.go

+func sourceReader(paths []string, format string, ingestionProps map[string]any, fromAthena bool) (string, error) {
 	// Generate a "read" statement
 	if containsAny(format, []string{".csv", ".tsv", ".txt"}) {
 		// CSV reader
 		return generateReadCsvStatement(paths, ingestionProps)
-	} else if strings.Contains(format, ".parquet") {
+	} else if strings.Contains(format, ".parquet") || fromAthena {


This change shouldn't be needed, the sqlstore_to_duckdb transporter already sets the format to format := fileutil.FullExt(files[0])

Athena outputs parquet files without extension, and there's no configuration to change that. So the source reader doesn't have anything to detect parquet format here.

Partly changed this so that there is no fromAthena arg but .parquet extension is added to format

begelundmuller · 2023-09-14T15:41:33Z

runtime/drivers/athena/sql_store.go

+	// ie
+	// outputLocation s3://bucket-name/prefix
+	// unloadLocation s3://bucket-name/prefix/rill-connector-parquet-output-<uuid>
+	// unloadPath prefix/rill-connector-parquet-output-<uuid>
+	unloadFolderName := "parquet_output_" + uuid.New().String()
+	bucketName := strings.Split(strings.TrimPrefix(outputLocation, "s3://"), "/")[0]
+	unloadLocation := strings.TrimRight(outputLocation, "/") + "/" + unloadFolderName
+	unloadPath := strings.TrimPrefix(strings.TrimPrefix(unloadLocation, "s3://"+bucketName), "/")


Use url.Parse for URI manipulation

Change parquet_output_ to rill_tmp_

begelundmuller · 2023-09-14T15:49:37Z

runtime/drivers/athena/athena.go

+	} else if conf.WorkGroup != "" {
+		wo, err := client.GetWorkGroup(ctx, &athena.GetWorkGroupInput{
+			WorkGroup: aws.String(conf.WorkGroup),
+		})
+		if err != nil {
+			return "", err
+		}
+		return *wo.WorkGroup.Configuration.ResultConfiguration.OutputLocation, nil
+	}


Is a workgroup's output location guaranteed to be non-nil?

It's not guaranteed, but if a user specifies a workgroup in order to avoid specifying an output location, and that workgroup has no location configured, failing fast is the right behavior.

begelundmuller · 2023-09-14T15:58:15Z

runtime/drivers/athena/athena.go

+	r := retrier.New(retrier.ConstantBackoff(int(5*time.Minute/time.Second), time.Second), nil) // 5 minutes timeout
+	return r.RunCtx(ctx, func(ctx context.Context) error {
+		status, err := client.GetQueryExecution(ctx, &athena.GetQueryExecutionInput{
+			QueryExecutionId: athenaExecution.QueryExecutionId,
+		})
+		if err != nil {
+			return err
+		}
+
+		state := status.QueryExecution.Status.State
+
+		if state == types.QueryExecutionStateSucceeded || state == types.QueryExecutionStateCancelled {
+			return nil
+		} else if state == types.QueryExecutionStateFailed {
+			return fmt.Errorf("Athena query execution failed %s", *status.QueryExecution.Status.AthenaError.ErrorMessage)
+		}
+		return fmt.Errorf("Athena ingestion timeout")
+	})


I still don't see how this will stop retrying on the query failed case. If it returns an error, it retries, right?

Also, is polling every second recommended? What's the recommended polling interval?

What's the recommended polling interval?

There's no recommendation.

It is 1 sec in code samples

begelundmuller · 2023-09-14T15:59:43Z

runtime/drivers/athena/athena.go

+
+func cleanPath(ctx context.Context, cfg aws.Config, bucketName, prefix string) error {


All the functions from here and after in this file are related to sql_store.go, so please move them there.

Also try to re-organize to follow the function ordering guidelines: https://github.com/uber-go/guide/blob/master/style.md#function-grouping-and-ordering

egor-ryashin · 2023-09-21T11:34:33Z

It's missing the ability to configure the AWS region now?

Yes, the region should be returned back. Right now, the region is resolved from the default aws configuration profile.

esevastyanov · 2023-09-25T13:52:19Z

Yes, the region should be returned back. Right now, the region is resolved from the default aws configuration profile.

Returned the region, set its default value to us-east-1

esevastyanov · 2023-09-25T13:57:10Z

Is it guaranteed to be the output_location one?

output_location is optional, workgroup is optional.
Default value for a workgroup is primary (this is a default Athena workgroup that cannot be deleted).
A workgroup may have no output location set.
A workgroup may also have a feature that overrides a client-side output location by its own ~~so the logic ignores workgroup property if output_location is specified.~~
There are two locations: output and unload. Output location is used for metadata, while the unload is used for data. Even if a workgroup overrides the output location, the custom unload location may still be used.

begelundmuller · 2023-09-25T17:27:41Z

runtime/drivers/athena/sql_store.go

+	bucketObj, err := openBucket(ctx, awsConfig, bucketName)
+	if err != nil {
+		return nil, errors.Join(fmt.Errorf("cannot open bucket %q: %w", bucketName, err), cleanupFn())
+	}
+
+	opts := rillblob.Options{
+		GlobPattern: unloadPath + "/**",
+	}
+
+	it, err := rillblob.NewIterator(ctx, bucketObj, opts, c.logger)
+	if err != nil {
+		return nil, errors.Join(fmt.Errorf("cannot download parquet output %q %w", opts.GlobPattern, err), cleanupFn())
+	}


Should it call cleanupFn before returning in these error conditions?

Reduced the number of calls by moving the call into defer
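A sketch of the deferred form, under the assumption that cleanup should run only on error paths (on success the caller still needs the unloaded files). `runSteps` and `demoDeferredCleanup` are hypothetical names; the named return value lets a single `defer` cover every early return.

```go
package main

import (
	"errors"
	"fmt"
)

// runSteps executes steps in order. If any step fails, the deferred function
// runs cleanup once and joins its error with the original one.
func runSteps(steps []func() error, cleanup func() error) (err error) {
	defer func() {
		if err != nil {
			err = errors.Join(err, cleanup())
		}
	}()
	for _, step := range steps {
		if err = step(); err != nil {
			return err
		}
	}
	return nil
}

// demoDeferredCleanup counts cleanup calls for a failing or succeeding run.
func demoDeferredCleanup(fail bool) (cleanups int, err error) {
	cleanup := func() error { cleanups++; return nil }
	ok := func() error { return nil }
	second := ok
	if fail {
		second = func() error { return errors.New("cannot open bucket") }
	}
	err = runSteps([]func() error{ok, second, ok}, cleanup)
	return cleanups, err
}

func main() {
	fmt.Println(demoDeferredCleanup(true))
	fmt.Println(demoDeferredCleanup(false))
}
```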

begelundmuller · 2023-09-25T17:29:10Z

runtime/drivers/athena/sql_store.go

+	// outputLocation s3://bucket/path
+	// unloadLocation s3://bucket/path/rill_tmp_<uuid>
+	// unloadPath path/rill_tmp_<uuid>
+	unloadFolderName := "rill_tmp_" + uuid.New().String()


nit: Let's avoid mixing underscores and dashes; it always looks a bit weird. We could either use rill-tmp- instead or do strings.Replace(uuid.New().String(), "-", "") for the random characters

Used rill-tmp-

begelundmuller · 2023-09-25T17:33:20Z

runtime/drivers/athena/sql_store.go

+}
+
+func (c *Connection) unload(ctx context.Context, client *athena.Client, conf *sourceProperties, unloadLocation string) error {
+	finalSQL := fmt.Sprintf("UNLOAD (%s) TO '%s' WITH (format = 'PARQUET')", conf.SQL, unloadLocation)


We should put a newline after the injected SQL (UNLOAD (%s\n) ...) in case there's a comment at the end of it
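To illustrate why the newline matters: if the user's SQL ends with a `--` line comment, a closing parenthesis on the same line would become part of the comment. A minimal sketch, with `buildUnloadSQL` as a hypothetical helper name:

```go
package main

import "fmt"

// buildUnloadSQL wraps the user query in an UNLOAD statement. The "\n" before
// the closing parenthesis keeps a trailing "--" comment in sql from
// swallowing the rest of the statement.
func buildUnloadSQL(sql, unloadLocation string) string {
	return fmt.Sprintf("UNLOAD (%s\n) TO '%s' WITH (format = 'PARQUET')", sql, unloadLocation)
}

func main() {
	fmt.Println(buildUnloadSQL("SELECT * FROM t -- latest rows", "s3://bucket/path/rill-tmp-123"))
}
```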

begelundmuller · 2023-09-25T17:40:35Z

runtime/drivers/athena/sql_store.go

+	tm := time.NewTimer(5 * time.Minute)
+	defer tm.Stop()
+	for {
+		select {
+		case <-tm.C:
+			return fmt.Errorf("Athena ingestion timed out")
+		default:
+			status, err := client.GetQueryExecution(ctx, &athena.GetQueryExecutionInput{
+				QueryExecutionId: queryExecutionOutput.QueryExecutionId,
+			})
+			if err != nil {
+				return err
+			}
+
+			switch status.QueryExecution.Status.State {
+			case types2.QueryExecutionStateSucceeded, types2.QueryExecutionStateCancelled:
+				return nil
+			case types2.QueryExecutionStateFailed:
+				return fmt.Errorf("Athena query execution failed %s", *status.QueryExecution.Status.AthenaError.ErrorMessage)
+			}
+		}
+		time.Sleep(time.Second)
+	}


Instead of using a hard-coded timer, it can use the ctx (check <-ctx.Done()), which has a timeout that's configurable through the timeout: YAML property.

If there's a ctx cancellation/timeout, it would be nice to cancel the running query.

For case ..., types2.QueryExecutionStateCancelled – shouldn't this case return an error? Otherwise, it will try to consume the results of the cancelled query.
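The state handling requested above can be sketched as a pure function. The `queryState` type and its constants are hypothetical local stand-ins for the SDK's query execution state values; the mapping treats a cancelled query as an error so that its results are never consumed.

```go
package main

import (
	"errors"
	"fmt"
)

// queryState is a local stand-in for the SDK's query execution state type.
type queryState string

const (
	stateQueued    queryState = "QUEUED"
	stateRunning   queryState = "RUNNING"
	stateSucceeded queryState = "SUCCEEDED"
	stateFailed    queryState = "FAILED"
	stateCancelled queryState = "CANCELLED"
)

// classify reports whether polling can stop successfully (done) or must stop
// with an error. A nil error with done == false means "keep polling".
func classify(state queryState, reason string) (done bool, err error) {
	switch state {
	case stateSucceeded:
		return true, nil
	case stateCancelled:
		return false, errors.New("Athena query was cancelled")
	case stateFailed:
		return false, fmt.Errorf("Athena query execution failed: %s", reason)
	default: // queued or running
		return false, nil
	}
}

func main() {
	for _, s := range []queryState{stateQueued, stateRunning, stateSucceeded, stateFailed, stateCancelled} {
		done, err := classify(s, "syntax error")
		fmt.Println(s, done, err)
	}
}
```

When the surrounding loop exits because `ctx` is done, the running query could additionally be stopped through Athena's StopQueryExecution operation; that call is omitted here.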

begelundmuller · 2023-09-25T17:42:30Z

runtime/drivers/athena/sql_store.go

+		if out.IsTruncated {
+			continuationToken = out.NextContinuationToken


Maybe also check out.NextContinuationToken != nil just to be extra cautious that we don't loop forever?
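A sketch of the guarded pagination loop, using a hypothetical `page` type and `list` callback instead of the S3 client. The loop stops when the listing is not truncated or when no continuation token is returned, so a response that claims truncation without a token cannot cause an endless loop.

```go
package main

import "fmt"

// page is a hypothetical stand-in for one listing response.
type page struct {
	Keys                  []string
	IsTruncated           bool
	NextContinuationToken *string
}

// listAll collects keys across pages and stops on a missing token.
func listAll(list func(token *string) (page, error)) ([]string, error) {
	var keys []string
	var token *string
	for {
		out, err := list(token)
		if err != nil {
			return nil, err
		}
		keys = append(keys, out.Keys...)
		if !out.IsTruncated || out.NextContinuationToken == nil {
			return keys, nil
		}
		token = out.NextContinuationToken
	}
}

// demoListAll feeds listAll a second page that wrongly reports truncation
// without providing a token.
func demoListAll() ([]string, error) {
	next := "page-2"
	pages := map[string]page{
		"":       {Keys: []string{"a", "b"}, IsTruncated: true, NextContinuationToken: &next},
		"page-2": {Keys: []string{"c"}, IsTruncated: true, NextContinuationToken: nil},
	}
	return listAll(func(token *string) (page, error) {
		key := ""
		if token != nil {
			key = *token
		}
		return pages[key], nil
	})
}

func main() {
	keys, err := demoListAll()
	fmt.Println(keys, err)
}
```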

begelundmuller · 2023-09-25T17:47:09Z

runtime/drivers/duckdb/transporter/sqlstore_to_duckDB.go

+	fromAthena := reflect.TypeOf(s.from).AssignableTo(reflect.TypeOf(&athena.Connection{}))
 	for iter.HasNext() {
 		files, err := iter.NextBatch(_sqlStoreIteratorBatchSize)
 		if err != nil {
 			return err
 		}

 		format := fileutil.FullExt(files[0])
+		if fromAthena {
+			// Athena doesn't specify ".parquet" extension in output file names
+			// Append ".parquet" extension to the extension generated by Athena
+			format += ".parquet"
+		}


This reflection really hurts... Is there not a way to have Athena give output files the right file extension with UNLOAD?

If not, we should add a way to propagate the format, maybe by adding it in rillblob.Options and exposing it as iter.Format() or something like that.
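A rough sketch of the format-propagation idea, assuming a hypothetical `Format()` method on the iterator. Here `filepath.Ext` stands in for `fileutil.FullExt`, and the iterator types are illustrative only. The transporter asks the iterator first and falls back to the file extension, so no reflection on the source type is needed.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// fileIterator exposes a known format for all its files, or "" if unknown.
// Format is a hypothetical method sketched from the review suggestion.
type fileIterator interface {
	Format() string
}

// athenaIterator always yields Parquet files, even without an extension.
type athenaIterator struct{}

func (athenaIterator) Format() string { return ".parquet" }

// genericIterator has no fixed format and relies on file extensions.
type genericIterator struct{}

func (genericIterator) Format() string { return "" }

// resolveFormat prefers the iterator's declared format and otherwise
// derives it from the first file name.
func resolveFormat(it fileIterator, firstFile string) string {
	if f := it.Format(); f != "" {
		return f
	}
	return filepath.Ext(firstFile)
}

func main() {
	fmt.Println(resolveFormat(athenaIterator{}, "20230925_123456_00001_abcde"))
	fmt.Println(resolveFormat(genericIterator{}, "data.csv"))
}
```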

begelundmuller

Looks solid!

Egor Ryashin added 2 commits September 1, 2023 12:07

athena-driver

2ee73b7

athena-driver

328f2ae

egor-ryashin marked this pull request as ready for review September 1, 2023 13:26

egor-ryashin requested review from begelundmuller and nishantmonu51 September 1, 2023 13:27

Egor Ryashin added 2 commits September 4, 2023 12:22

athena-driver

a047f82

athena-driver

dbba228

begelundmuller requested changes Sep 4, 2023

Egor Ryashin added 11 commits September 5, 2023 16:01

athena-driver review

985f195

athena-driver review

3db7655

athena-driver review

5c66737

athena-driver review

605b791

athena-driver review

2008155

athena-driver review

00d3ec0

athena-driver review

dc52056

athena-driver review

d7e774b

Merge remote-tracking branch 'origin/main' into athena-connector

e68b2af

Merge remote-tracking branch 'origin/main' into athena-connector

29b816e

Merge remote-tracking branch 'origin/main' into athena-connector

e5f1794

egor-ryashin requested a review from begelundmuller September 7, 2023 09:05

Run go mod tidy

549cf67

begelundmuller requested changes Sep 7, 2023

Egor Ryashin added 7 commits September 11, 2023 19:38

athena-driver review

e040606

athena-driver review

804f03e

athena-driver review

844877f

athena-driver review

f0bbee9

Merge remote-tracking branch 'origin/main' into athena-connector

8d4b69b

Merge remote-tracking branch 'origin/main' into athena-connector

03d5c42

Merge remote-tracking branch 'origin/main' into athena-connector

a5a5146

Merge remote-tracking branch 'origin/main' into athena-connector

45efd9a

egor-ryashin requested a review from begelundmuller September 14, 2023 11:40

begelundmuller requested changes Sep 14, 2023

athena-driver review

b594ed4

esevastyanov added 3 commits September 22, 2023 23:44

Merge remote-tracking branch 'origin/main' into athena-connector

7ab90bc

# Conflicts: # runtime/drivers/duckdb/transporter/sqlstore_to_duckDB.go # web-common/src/features/sources/modal/AddSourceModal.svelte

Auto-determine AWS region

bf2f044

cleanUp function; Added AWS region and reordered functions; Moved functions to sql_store; Renaming and code refactoring

Athena icon

28f3e6e

nishantmonu51 added the blocker label (a release blocker issue that should be resolved before a new release) Sep 25, 2023

esevastyanov added 3 commits September 25, 2023 16:27

Removed the auto-resolving of AWS region

18b423d

Simplified a clean-up process

679e85d

Merge remote-tracking branch 'origin/main' into athena-connector

7e320ac

# Conflicts: # runtime/services/catalog/artifacts/yaml/objects.go # web-common/src/features/sources/modal/yupSchemas.ts

esevastyanov requested a review from begelundmuller September 25, 2023 13:57

Updated according to previously merged changes

3ab245d

begelundmuller requested changes Sep 25, 2023

esevastyanov added 9 commits September 25, 2023 22:07

Dash vs underscore

804de57

A new line after a query

591a4ef

Non-nil NextContinuationToken

bc7f95d

ctx cancellation instead of a hardcoded timer

af69414

Format for FileIterator

0bd4d25

deferred cleanupFn()

a0d18da

Aligned Athena query with a source config

a1d0eb9

Merge remote-tracking branch 'origin/main' into athena-connector

6d72df9

# Conflicts: # runtime/drivers/blob/blobdownloader.go

Fixed a merge conflict

480aa95

esevastyanov requested a review from begelundmuller September 25, 2023 20:55

begelundmuller approved these changes Sep 26, 2023

begelundmuller merged commit 5b56565 into main Sep 26, 2023
4 checks passed

begelundmuller deleted the athena-connector branch September 26, 2023 09:28
