-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: Don't reset snapshotLog in TableMetadata.removeRef
method
#11779
base: main
Are you sure you want to change the base?
Conversation
81647f9
to
131b64e
Compare
core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java
Outdated
Show resolved
Hide resolved
57a1ec2
to
d84eb2c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ebyhr I think we may want to take a different approach.
Currently, we clear the snapshot log when TableMetadata#removeRef is called here . If you see, the non-REST case goes through resetMain
which won't clear the snapshot log but for the REST changes that are produced, we produce a RemoveSnapshotRef update type which when applied server side clears the log.
Quite simply, I think we probably just want to remove the snapshotLog.clear()
call in the removeRef API. In the past, I think there was concern that a user using that API may somehow rely on that behavior but really the only action on an Iceberg table that should remove history is snapshot expiration. removeRef(MAIN)
should not really change history in anyway, this was probably just an oversight in the implementation done a long time ago.
I also discussed this issue with a few others @Fokko @nastra @rdblue @danielcweeks and I think we're on the same page there.
What are your thoughts? If you agree, would be happy to review the PR with the change to stop clearing the snapshot log, and some tests to verify that history is preserved during a REPLACE, and we can get a fix for this issue in!
d84eb2c
to
9c2ca14
Compare
9c2ca14
to
4e55282
Compare
TableMetadata.removeRef
method
@amogh-jahagirdar There is no reason to disagree from my perspective. I was simply trying to keep the existing behavior as much as possible because I didn't know the historical reason. Reverted most changes and removed |
@@ -2418,6 +2421,34 @@ public void testPaginationForListTables(int numberOfItems) { | |||
eq(ListTablesResponse.class)); | |||
} | |||
|
|||
@Test | |||
public void testReplaceTableKeepsSnapshotLog() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we put this test in CatalogTests
? That'll cover both REST + non-REST cases (TestRESTCatalog extends CatalogTests
).
table.refresh(); | ||
assertThat(((BaseTable) table).operations().current().snapshotLog()) | ||
.hasSize(2) | ||
.containsAll(snapshotLogBeforeReplace); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we assert the exact snapshot log contents instead of just doing a containsAll check? We should have both the log before the replace and the new snapshot produced as part of the replace.
Fixes #11777