DROP TABLE with PURGE does not delete metadata.json files #289

Closed
1 task done
loicalleyne opened this issue Sep 12, 2024 · 15 comments · Fixed by #312
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@loicalleyne

Is this a possible security vulnerability?

  • This is NOT a possible security vulnerability

Describe the bug

When dropping a table, the data folder is deleted but the metadata folder remains with the metadata.json files it contains.

To Reproduce

Using postgres as the metadata store, and GCS for storage.

The queries below (executed in Trino) create a schema/namespace in the catalog, create a table from some sample data in BigQuery (the data can come from anywhere, really), copy the data to a second table, then drop the second table.

CREATE SCHEMA IF NOT EXISTS polarisgcs.test_schema1 WITH (location = 'gs://bucket-iceberg-polaris/polaris/test_schema1');

CREATE TABLE IF NOT EXISTS polarisgcs.test_schema1.test1 WITH (format='PARQUET', partitioning = ARRAY['domain_id', 'day(date)']) AS SELECT date, domain_id, country, resource_id, metric1 FROM bq.sample.table1;

CREATE TABLE IF NOT EXISTS polarisgcs.test_schema1.test2 WITH (format='PARQUET', partitioning = ARRAY['domain_id', 'day(date)']) AS SELECT date, domain_id, country, resource_id, metric1 FROM polarisgcs.test_schema1.test1;

DROP TABLE polarisgcs.test_schema1.test2;

Actual Behavior

Files in {table name}/metadata are not deleted:

>gsutil ls gs://bucket-iceberg-polaris/polaris/test_schema1/test2*/**
gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/00000-af991efc-4628-4703-95c4-338a721eada7.metadata.json
gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/20240912_202545_00006_ft2jx-d05107ef-d9c4-4021-a450-2ecb4c796c2f.stats
gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/snap-789093043791680477-1-f9449a04-12db-4360-a6fe-f577830d075a.avro

Expected Behavior

All files in the dropped table's path are deleted.

Additional context

postgres as backing metadata store
GCS storage

{"timestamp":1726172836631,"level":"INFO","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.auth.TestInlineBearerTokenPolarisAuthenticator","message":"Checking for existence of principal polarisTester in map {principal=polarisTester, realm=polaris_iceberg}","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836631,"level":"WARN","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.auth.TestInlineBearerTokenPolarisAuthenticator","message":"Failed to load secrets for principal polarisTester","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836631,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.auth.BasePolarisAuthenticator","message":"Resolving principal for tokenInfo client_id=null","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836631,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.config.RealmEntityManagerFactory","message":"Looking up PolarisEntityManager for realm polaris_iceberg","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836631,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.auth.BasePolarisAuthenticator","message":"Resolved principal: name=polarisTester;id=7;parentId=0;entityVersion=1;type=PRINCIPAL;subType=NULL_SUBTYPE;internalProperties={client_id=15454eed974a106b}","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836631,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.auth.BasePolarisAuthenticator","message":"Populating authenticatedPrincipal into CallContext: principalEntity=name=polarisTester;id=7;parentId=0;entityVersion=1;type=PRINCIPAL;subType=NULL_SUBTYPE;internalProperties={client_id=15454eed974a106b};activatedPrincipalRoleNames=[];activatedPrincipalRoles=null","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836633,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.api.IcebergRestCatalogApi","message":"Invoking CatalogApi with params","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{"purgeRequested":true,"prefix":"polaris","namespace":"test_schema1","operation":"dropTable","table":"test2"}}
{"timestamp":1726172836633,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.config.RealmEntityManagerFactory","message":"Looking up PolarisEntityManager for realm polaris_iceberg","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836633,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.core.auth.PolarisAuthorizer","message":"Satisfied privilege TABLE_DROP with grantRecord PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=31} from securable entity:name=polaris;id=11;parentId=0;entityVersion=1;type=CATALOG;subType=NULL_SUBTYPE;internalProperties={catalogType=INTERNAL, storage_configuration_info={\"@type\":\"GcpStorageConfigurationInfo\",\"allowedLocations\":[\"gs://bucket-iceberg-polaris/polaris\"],\"storageType\":\"GCS\",\"fileIoImplClassName\":\"org.apache.iceberg.gcp.gcs.GCSFileIO\"}};grantRecordsAsGrantee:[];grantRecordsAsSecurable:[PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=2}, PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=31}, PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=32}] for principalName polarisTester and activatedIds [2, 12]","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836633,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.core.auth.PolarisAuthorizer","message":"Satisfied privilege TABLE_WRITE_DATA with grantRecord PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=32} from securable entity:name=polaris;id=11;parentId=0;entityVersion=1;type=CATALOG;subType=NULL_SUBTYPE;internalProperties={catalogType=INTERNAL, storage_configuration_info={\"@type\":\"GcpStorageConfigurationInfo\",\"allowedLocations\":[\"gs://bucket-iceberg-polaris/polaris\"],\"storageType\":\"GCS\",\"fileIoImplClassName\":\"org.apache.iceberg.gcp.gcs.GCSFileIO\"}};grantRecordsAsGrantee:[];grantRecordsAsSecurable:[PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=2}, PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=31}, PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=32}] for principalName polarisTester and activatedIds [2, 12]","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836634,"level":"INFO","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.context.PolarisCallContextCatalogFactory","message":"Initializing new BasePolarisCatalog for key: polaris_iceberg/polaris","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836634,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.config.RealmEntityManagerFactory","message":"Looking up PolarisEntityManager for realm polaris_iceberg","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836634,"level":"INFO","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.context.PolarisCallContextCatalogFactory","message":"Looked up defaultBaseLocation gs://bucket-iceberg-polaris/polaris for catalog polaris_iceberg/polaris","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836634,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.BasePolarisCatalog","message":"Resolved ioImplClassName org.apache.iceberg.gcp.gcs.GCSFileIO for storageConfiguration GcpStorageConfigurationInfo{storageType=GCS, allowedLocation=[gs://bucket-iceberg-polaris/polaris], gcpServiceAccount=null}","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836634,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.BasePolarisCatalog","message":"Not initializing default catalogFileIO","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836634,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.BasePolarisCatalog","message":"new BasePolarisTableOperations for test_schema1.test2","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836634,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.BasePolarisCatalog","message":"doRefresh for tableIdentifier test_schema1.test2","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836635,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.core.persistence.resolver.PolarisResolutionManifest","message":"Returning resolvedEntities from getPassthroughResolvedPath: [entity:name=polaris;id=11;parentId=0;entityVersion=1;type=CATALOG;subType=NULL_SUBTYPE;internalProperties={catalogType=INTERNAL, storage_configuration_info={\"@type\":\"GcpStorageConfigurationInfo\",\"allowedLocations\":[\"gs://bucket-iceberg-polaris/polaris\"],\"storageType\":\"GCS\",\"fileIoImplClassName\":\"org.apache.iceberg.gcp.gcs.GCSFileIO\"}};grantRecordsAsGrantee:[];grantRecordsAsSecurable:[PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=2}, PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=31}, PolarisGrantRec{securableCatalogId=0, securableId=11, granteeCatalogId=11, granteeId=12, privilegeCode=32}], entity:name=test_schema1;id=13;parentId=11;entityVersion=1;type=NAMESPACE;subType=NULL_SUBTYPE;internalProperties={};grantRecordsAsGrantee:[];grantRecordsAsSecurable:[], entity:name=test2;id=15;parentId=13;entityVersion=2;type=TABLE_LIKE;subType=TABLE;internalProperties={metadata-location=gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/00001-89250289-fa69-4a2a-870e-441a9f4ad873.metadata.json, parent-namespace=test_schema1};grantRecordsAsGrantee:[];grantRecordsAsSecurable:[]]","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836635,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.BasePolarisCatalog","message":"Refreshing latestLocation: gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/00001-89250289-fa69-4a2a-870e-441a9f4ad873.metadata.json","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836636,"level":"INFO","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.iceberg.BaseMetastoreTableOperations","message":"Refreshing table metadata from new version: gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/00001-89250289-fa69-4a2a-870e-441a9f4ad873.metadata.json","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836636,"level":"INFO","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.iceberg.CatalogUtil","message":"Loading custom FileIO implementation: org.apache.iceberg.gcp.gcs.GCSFileIO","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836734,"level":"INFO","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.BasePolarisCatalog","message":"Scheduled cleanup task 16 for table test_schema1.test2","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836742,"level":"DEBUG","thread":"pool-3-thread-58 - DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2?purgeRequested=true","logger":"org.apache.polaris.service.catalog.api.IcebergRestCatalogApi","message":"Completed execution of dropTable API with status code 200","mdc":{"spanId":"0da7f8369a5be134","traceId":"ad635ef2168e9224e25bcb4cfeb587b6","realm":"polaris_iceberg","request_id":null},"params":{}}
{"timestamp":1726172836743,"level":"INFO","thread":"pool-3-thread-58","logger":"io.opentelemetry.exporter.logging.LoggingSpanExporter","message":"'DELETE /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2' : ad635ef2168e9224e25bcb4cfeb587b6 0da7f8369a5be134 SERVER [tracer: /api/catalog/v1/polaris/namespaces/test_schema1/tables/test2:] AttributesMap{data={realm=polaris_iceberg, url.scheme=http, server.address=polaris, url.path=/api/catalog/v1/polaris/namespaces/test_schema1/tables/test2, http.request.method=DELETE}, capacity=128, totalAddedValues=5}","mdc":{"request_id":null,"realm":"polaris_iceberg"},"params":{}}
{"timestamp":1726172836746,"level":"DEBUG","thread":"taskHandler-2","logger":"org.apache.polaris.core.storage.cache.StorageCredentialCache","message":"StorageCredentialCache::load","params":{}}
{"timestamp":1726172837097,"level":"INFO","thread":"taskHandler-2","logger":"org.apache.iceberg.CatalogUtil","message":"Loading custom FileIO implementation: org.apache.iceberg.gcp.gcs.GCSFileIO","params":{}}
{"timestamp":1726172837587,"level":"INFO","thread":"taskHandler-3","logger":"org.apache.iceberg.CatalogUtil","message":"Loading custom FileIO implementation: org.apache.iceberg.gcp.gcs.GCSFileIO","params":{}}
{"timestamp":1726172837671,"level":"INFO","thread":"taskHandler-2","logger":"org.apache.polaris.service.task.TaskExecutorImpl","message":"Task successfully handled","params":{"handlerClass":"org.apache.polaris.service.task.TableCleanupTaskHandler","taskEntityId":16}}
{"timestamp":1726172837976,"level":"DEBUG","thread":"taskHandler-3","logger":"org.apache.polaris.service.task.ManifestFileCleanupTaskHandler","message":"Scheduled 4 data files to be deleted from manifest gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/f9449a04-12db-4360-a6fe-f577830d075a-m0.avro","params":{}}
{"timestamp":1726172838303,"level":"INFO","thread":"","logger":"org.apache.polaris.service.task.ManifestFileCleanupTaskHandler","message":"All data files in manifest deleted - deleting manifest","params":{"manifestFile":"gs://bucket-iceberg-polaris/polaris/test_schema1/test2-139045f9cbde4d1f9ef643141621a208/metadata/f9449a04-12db-4360-a6fe-f577830d075a-m0.avro"}}
{"timestamp":1726172838478,"level":"INFO","thread":"taskHandler-3","logger":"org.apache.polaris.service.task.TaskExecutorImpl","message":"Task successfully handled","params":{"handlerClass":"org.apache.polaris.service.task.ManifestFileCleanupTaskHandler","taskEntityId":18}}

System information

Trino 449
Polaris git: 6fcf5ccaebd7ca13a0cb96c96adca699a24080a0

@loicalleyne loicalleyne added the bug Something isn't working label Sep 12, 2024
@eric-maynard
Contributor

The delete is supposed to happen here. Can you share the steps you used to reproduce this with Trino?

@loicalleyne
Author

@eric-maynard I ran the queries listed in the To Reproduce of the OP in Trino. Do you need Trino configuration information in addition to this?

@eric-maynard
Contributor

Sure, or if you can reproduce this any other way (via calls to the CLI, or another engine), that works too! I'm just hoping to create a minimal reproducing example to debug. A unit test would be ideal.
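
One engine-independent way to reproduce this might be to replay the REST call that appears in the server log of the original report. The following is a minimal sketch, not a verified repro: the path and the `purgeRequested=true` parameter are copied from those log lines, while the base URL `http://localhost:8181`, the token value, and the class name `DropTablePurgeRequest` are placeholders assumed for illustration. It only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) the DROP TABLE ... PURGE request seen in the server log. */
class DropTablePurgeRequest {

  /**
   * baseUrl and bearerToken are assumptions supplied by the caller.
   * The path and query string are taken from the log lines in the report.
   */
  static HttpRequest build(String baseUrl, String bearerToken) {
    URI uri = URI.create(
        baseUrl
            + "/api/catalog/v1/polaris/namespaces/test_schema1/tables/test2"
            + "?purgeRequested=true");
    return HttpRequest.newBuilder(uri)
        .header("Authorization", "Bearer " + bearerToken)
        .DELETE()
        .build();
  }

  public static void main(String[] args) {
    // Placeholder host, port and token; replace with values for your deployment.
    HttpRequest request = build("http://localhost:8181", "example-token");
    System.out.println(request.method() + " " + request.uri());
  }
}
```

Sending the request with `java.net.http.HttpClient` and then listing the table's storage location should show whether any files under `metadata/` survive the purge.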

@flyrain
Contributor

flyrain commented Sep 12, 2024

Polaris does not delete any historical metadata.json files; only the current one is deleted.
This Iceberg PR you posted fixed it in the Iceberg class CatalogUtil, but Polaris doesn't use that class for a few reasons, such as async deletion. To fix it, we can follow the approach in the PR and collect the full set of files to delete in Polaris.

@flyrain flyrain added the good first issue Good for newcomers label Sep 12, 2024
@sfc-gh-ygu
Contributor

Specifically, we need the following two methods to get all metadata.json files and stats files:

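  // Imports needed by the enclosing class:
  // import java.util.Set;
  // import java.util.stream.Collectors;
  // import org.apache.iceberg.StatisticsFile;
  // import org.apache.iceberg.TableMetadata;

  // Every metadata.json location: all entries in the metadata log plus the current file.
  private static Set<String> metadataLocations(TableMetadata tableMetadata) {
    Set<String> metadataLocations =
        tableMetadata.previousFiles().stream()
            .map(TableMetadata.MetadataLogEntry::file)
            .collect(Collectors.toSet());
    metadataLocations.add(tableMetadata.metadataFileLocation());
    return metadataLocations;
  }

  // Every statistics file location referenced by the table metadata.
  private static Set<String> statsLocations(TableMetadata tableMetadata) {
    return tableMetadata.statisticsFiles().stream()
        .map(StatisticsFile::path)
        .collect(Collectors.toSet());
  }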
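
To illustrate how the two helpers above could be combined into a single set of files to purge, here is a small self-contained sketch. It deliberately avoids the Iceberg classes: `MetadataView`, `filesToPurge`, and the file paths are hypothetical stand-ins invented for this example, and the actual Polaris change may be structured differently.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PurgeFileSetSketch {

  /** Hypothetical stand-in for the three TableMetadata accessors used above. */
  static final class MetadataView {
    final String currentMetadataFile;
    final List<String> previousMetadataFiles;
    final List<String> statisticsFiles;

    MetadataView(String current, List<String> previous, List<String> stats) {
      this.currentMetadataFile = current;
      this.previousMetadataFiles = previous;
      this.statisticsFiles = stats;
    }
  }

  /** Union of historical metadata.json files, the current one, and stats files. */
  static Set<String> filesToPurge(MetadataView table) {
    // A set removes duplicates; LinkedHashSet keeps insertion order for readable output.
    Set<String> files = new LinkedHashSet<>(table.previousMetadataFiles);
    files.add(table.currentMetadataFile);
    files.addAll(table.statisticsFiles);
    return files;
  }

  public static void main(String[] args) {
    // Made-up paths for illustration only.
    MetadataView table = new MetadataView(
        "metadata/00001.metadata.json",
        List.of("metadata/00000.metadata.json"),
        List.of("metadata/example.stats"));
    filesToPurge(table).forEach(System.out::println);
  }
}
```

The point of the sketch is that the purge has to walk the metadata log and the statistics list, not just the current metadata.json; deleting only the current file leaves the historical metadata.json and .stats files behind.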

@loicalleyne
Author

loicalleyne commented Sep 13, 2024

I've been building an Iceberg playground using my fork of insta-infra to bring up containerized Polaris, postgres, minio and more.
I'll push a repro setup to a repo tomorrow and post the link here.

@loicalleyne
Author

I've just remembered that Polaris doesn't work with Minio due to lack of STS support. Do you know of any other service that can provide local S3 with STS? Or are you ok to use your own S3/GCS credentials for testing?

@flyrain
Contributor

flyrain commented Sep 13, 2024

Do you know of any other service that can provide local S3 with STS?

Not that I know of. Maybe @snazy and @dimas-b know of more options. BTW, can you open another issue for this question? It's a bit off-topic here.

@mayankvadariya

I've just remembered that Polaris doesn't work with Minio due to lack of STS support. Do you know of any other service that can provide local S3 with STS? Or are you ok to use your own S3/GCS credentials for testing?

It appears Localstack has STS support.
https://docs.localstack.cloud/user-guide/aws/sts/

@danielhumanmod
Contributor

Hi @flyrain, I am new to the Polaris community and this task seems like a good one to start with. May I work on this issue?

@flyrain
Contributor

flyrain commented Sep 19, 2024

Feel free to take it, @danielhumanmod.

@danielhumanmod
Contributor

danielhumanmod commented Sep 22, 2024

Hi @flyrain (@sfc-gh-ygu) and @eric-maynard, thank you for all the valuable context in this discussion. I have created a draft PR for this issue. Before it is ready for review, I have listed some open points in the PR's "To be discussed" section. I would greatly appreciate any insight you can provide on those questions!

@flyrain
Contributor

flyrain commented Sep 24, 2024

Thanks @danielhumanmod for working on it. We only need to add support for the historical metadata.json files and stats files. The others are already taken care of.

@danielhumanmod
Contributor

Hi team, for this feature, there are some potential future improvements to work on:

  1. [Refactoring] Separate manifest files and metadata file batches, using a common base class.
  2. [Feature] Implement async deletion for partition statistics files.

I’m happy to keep working on these improvements if you’re okay with it :)

@flyrain
Contributor

flyrain commented Dec 7, 2024

Hi @danielhumanmod, thanks for the contribution. Please feel free to do so. Don't hesitate to make small changes.
