DROP TABLE with PURGE does not delete metadata.json files #289
Comments
The delete is supposed to happen here. Can you share the steps you used to reproduce this with Trino? |
@eric-maynard I ran the queries listed in the To Reproduce of the OP in Trino. Do you need Trino configuration information in addition to this? |
Sure, or if you can reproduce this any other way (via calls to the CLI, or another engine) that works too! Just hoping to create a small minimally reproducing example to debug. A unit test would be ideal. |
Polaris does not delete any historical metadata.json files, only the current one will be deleted. |
Specifically, we need the following two methods to get all metadata.json files and stats files
|
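To make the gap concrete, here is a minimal sketch of how the extra files could be enumerated from a table's current metadata.json. It is not Polaris code: `collect_purge_paths` and the sample paths are hypothetical names for illustration. It assumes the field names from the Iceberg table spec, where `metadata-log` entries carry a `metadata-file` path and `statistics` entries carry a `statistics-path`. It also assumes the metadata log is complete; if the log has been trimmed, older files would not be found this way.

```python
import json


def collect_purge_paths(current_metadata_path, metadata):
    """Return the metadata and statistics file paths referenced by one
    parsed metadata.json document (hypothetical helper, not Polaris API)."""
    # Start with the current metadata.json, which is already deleted today.
    paths = [current_metadata_path]

    # Historical metadata.json files are recorded in the metadata log.
    for entry in metadata.get("metadata-log", []):
        paths.append(entry["metadata-file"])

    # Statistics files are listed separately from the metadata log.
    for stats in metadata.get("statistics", []):
        paths.append(stats["statistics-path"])

    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(paths))


if __name__ == "__main__":
    # Sample document with made-up paths, for illustration only.
    sample = json.loads("""
    {
      "metadata-log": [
        {"timestamp-ms": 1, "metadata-file": "tbl/metadata/00000.metadata.json"},
        {"timestamp-ms": 2, "metadata-file": "tbl/metadata/00001.metadata.json"}
      ],
      "statistics": [
        {"snapshot-id": 42, "statistics-path": "tbl/metadata/snap-42.stats"}
      ]
    }
    """)
    for path in collect_purge_paths("tbl/metadata/00002.metadata.json", sample):
        print(path)
```

A purge task could then pass each returned path to whatever file deletion call it already uses for the current metadata file.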
I've been building an Iceberg playground using my fork of insta-infra to bring up containerized Polaris, postgres, minio and more. |
I've just remembered that Polaris doesn't work with Minio due to lack of STS support. Do you know of any other service that can provide local S3 with STS? Or are you ok to use your own S3/GCS credentials for testing? |
It appears Localstack has STS support. |
Hi @flyrain, I am new to Polaris community and this task seems good to start with, may I work on this issue? |
Feel free to take it, @danielhumanmod. |
Hi @flyrain (@sfc-gh-ygu) @eric-maynard, thank you for all the valuable context in the discussion. I have created a draft PR for this issue. Before it's ready for review, I have listed some points that need to be discussed in the "To be discussed" section of the PR. I would greatly appreciate any insight you could provide on these questions! |
Thanks @danielhumanmod for working on it. We will only need to add support for the historical metadata.json files and stats files. The others are taken care of already. |
Hi Team, For this feature, we have some potential future improvements to work on:
I’m happy to keep working on these improvements if you’re okay with it :) |
Hi @danielhumanmod, thanks for the contribution. Please feel free to do so. Don't hesitate to make small changes. |
Is this a possible security vulnerability?
Describe the bug
When dropping a table, the data folder is deleted but the metadata folder remains with the metadata.json files it contains.
To Reproduce
Using postgres as the metadata store, and GCS for storage.
The queries below (executed in Trino) create a schema/namespace in the catalog, create a table from some sample data in BigQuery (the data can come from anywhere), copy the data to a second table, then drop the second table.
Actual Behavior
Files in {table name}/metadata are not deleted.
Expected Behavior
All files in the dropped table's path are deleted.
Additional context
postgres as backing metadata store
GCS storage
System information
Trino 449
Polaris git: 6fcf5ccaebd7ca13a0cb96c96adca699a24080a0