-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure "nessie.commit.id table" property is set when updating the table #19524
Conversation
...roduct-tests/src/main/java/io/trino/tests/product/iceberg/TestIcebergSparkCompatibility.java
Show resolved
Hide resolved
onSpark().executeQuery("CREATE TABLE " + sparkTableName + " (a INT, b INT, c INT) USING ICEBERG"); | ||
onSpark().executeQuery("INSERT INTO " + sparkTableName + " VALUES (1, 2, 3)"); | ||
|
||
onTrino().executeQuery("INSERT INTO " + trinoTableName + " VALUES (4, 5, 6)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This used to fail for Nessie and there was no test case to cover this case.
cc: @findepi, @ebyhr, @findinpath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch, @ajantha-bhat !
@findinpath and @findepi: Can I please get a review on this? This is a small PR. |
...no-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/AbstractIcebergTableOperations.java
Show resolved
Hide resolved
String sparkTableName = sparkTableName(baseTableName); | ||
String trinoTableName = trinoTableName(baseTableName); | ||
|
||
onSpark().executeQuery("CREATE TABLE " + sparkTableName + " (a INT, b INT, c INT) USING ICEBERG"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can stick to one column - there's no need to use multiple columns to test this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
String trinoTableName = trinoTableName(baseTableName); | ||
|
||
onSpark().executeQuery("CREATE TABLE " + sparkTableName + " (a INT, b INT, c INT) USING ICEBERG"); | ||
onSpark().executeQuery("INSERT INTO " + sparkTableName + " VALUES (1, 2, 3)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to ensure that the nessie.commit.id
gets updated ?
public static String getNessieCommitId(String tableName)
{
String propertiesTableName = "\"" + baseTableName + "$properties\"";
String trinoPropertiesTableName = trinoTableName(propertiesTableName);
return (String) onTrino()
.executeQuery("SELECT value FROM " + trinoPropertiesTableName + " WHERE key = 'nessie.commit.id'")
.getOnlyValue();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is common for other catalogs (groups) too. So, I didn't want to have a catalog (group) specific checks.
onSpark().executeQuery("INSERT INTO " + sparkTableName + " VALUES (1)"); | ||
|
||
onTrino().executeQuery("INSERT INTO " + trinoTableName + " VALUES (2)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
onSpark().executeQuery("INSERT INTO " + sparkTableName + " VALUES (1)"); | |
onTrino().executeQuery("INSERT INTO " + trinoTableName + " VALUES (2)"); | |
onSpark().executeQuery("INSERT INTO " + sparkTableName + " VALUES 1"); | |
onTrino().executeQuery("INSERT INTO " + trinoTableName + " VALUES 2"); |
Spark sets the table property NESSIE_COMMIT_ID_PROPERTY in NessieTableOperations#loadTableMetadata. Then NessieIcebergClient.commitTable uses this property. In Trino, this property is never set but used in NessieIcebergClient.commitTable as it is a common code. Hence, the commit id is old and doesn't allow new commits. Use the common code (available From Iceberg 1.4.0) NessieUtil.updateTableMetadataWithNessieSpecificProperties in Trino, which handles setting the property like "nessie.commit.id".
The failed test |
Description
Spark sets the table property
NESSIE_COMMIT_ID_PROPERTY
inNessieTableOperations#loadTableMetadata
. ThenNessieIcebergClient.commitTable
uses this property.In Trino, this property is never set but used in
NessieIcebergClient.commitTable
as it is a common code. Hence, the commit id is old and doesn't allow new commits.Use the common code (available From Iceberg 1.4.0)
NessieUtil.updateTableMetadataWithNessieSpecificProperties
in Trino, which handles setting the property like"nessie.commit.id"
.Additional context and related issues
Fixes #17813
Release notes
(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: