Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hive: Unwrap RuntimeException for Hive TException with rename table #9432

Merged
merged 1 commit into from
Jan 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -251,9 +251,14 @@ public void renameTable(TableIdentifier from, TableIdentifier originalTo) {
} catch (NoSuchObjectException e) {
throw new NoSuchTableException("Table does not exist: %s", from);

} catch (AlreadyExistsException e) {
throw new org.apache.iceberg.exceptions.AlreadyExistsException(
"Table already exists: %s", to);
} catch (InvalidOperationException e) {
if (e.getMessage() != null
&& e.getMessage().contains(String.format("new table %s already exists", to))) {
throw new org.apache.iceberg.exceptions.AlreadyExistsException(
"Table already exists: %s", to);
} else {
throw new RuntimeException("Failed to rename " + from + " to " + to, e);
}
Comment on lines +255 to +261
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we please move this under a catch (InvalidOperationException)?
But keep the old TException handler part


} catch (TException e) {
throw new RuntimeException("Failed to rename " + from + " to " + to, e);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
import org.apache.iceberg.common.DynMethods;
import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
import org.apache.iceberg.relocated.com.google.common.collect.Maps;
import org.apache.thrift.TException;

public class MetastoreUtil {
private static final DynMethods.UnboundMethod ALTER_TABLE =
Expand Down Expand Up @@ -54,7 +55,7 @@ private MetastoreUtil() {}
* context will be set in a way that turns off stats updates to avoid recursive file listing.
*/
public static void alterTable(
IMetaStoreClient client, String databaseName, String tblName, Table table) {
IMetaStoreClient client, String databaseName, String tblName, Table table) throws TException {
alterTable(client, databaseName, tblName, table, ImmutableMap.of());
}

Expand All @@ -67,11 +68,21 @@ public static void alterTable(
String databaseName,
String tblName,
Table table,
Map<String, String> extraEnv) {
Map<String, String> extraEnv)
throws TException {
Map<String, String> env = Maps.newHashMapWithExpectedSize(extraEnv.size() + 1);
env.putAll(extraEnv);
env.put(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);

ALTER_TABLE.invoke(client, databaseName, tblName, table, new EnvironmentContext(env));
try {
ALTER_TABLE.invoke(client, databaseName, tblName, table, new EnvironmentContext(env));
} catch (RuntimeException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How widely is alterTable used?
Wouldn't it be better (more general solution) to expose the wrapped MetaException instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only being used at two places:

  1. To update the table .
  2. To rename the table .

Are you proposing to throw non RTE MetaException at alter table . something like

catch (RuntimeException e) {
      if (e.getCause() instanceof InvalidOperationException
          && e.getCause().getMessage() != null
          && e.getCause()
              .getMessage()
              .contains(
                  String.format(
                      "new table %s.%s already exists", table.getDbName(), table.getTableName()))) {
        throw new MetaException(
            String.format("Table already exists: %s.%s", table.getDbName(), table.getTableName()));
      }
      throw e;
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposing something like this:

catch (RuntimeException e) {
      if (e.getCause() instanceof MetaException) {
        throw (MetaException) e.getCause()
      } else {
        throw e;
      }
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on the test case testRenameTableDestinationTableAlreadyExists since it expects error message as

.isInstanceOf(AlreadyExistsException.class)
        .hasMessageContaining("Table already exists");

Currently we have override this as

.isInstanceOf(RuntimeException.class)
        .hasMessageContaining("new table newdb.table_renamed already exists");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alterTable should return the MetaException and where we call it, we should handle the InvalidOperationException with the message of new table %s.%s already exists and translate it to an AlreadyExistsException.

We should only consider moving the exception if we find that all of the places where we call alterTable we would like to throw the same AlreadyExistsException

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InvalidOperationException still will be wrapped with RuntimeException . So basically we will have to catch RuntimeException and unwrap with AlreadyExistsException .

Or Am I missing something here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not unwrap RTE with both MetaException as well as InvalidOperationException and translate to org.apache.hadoop.hive.metastore.api.AlreadyExistsException . Since all the callers of alterTable is already handling org.apache.hadoop.hive.metastore.api.AlreadyExistsException.
I am thinking something like this:

catch (RuntimeException e) {
      // MetaException would be wrapped into RuntimeException during reflection
      if (e.getCause() instanceof MetaException) {
        throw (MetaException) e.getCause();
      }
      // InvalidOperationException would be wrapped into RuntimeException during reflection, as
      // java.lang.RuntimeException:InvalidOperationException(message:new table <> already exists)
      if (e.getCause() instanceof InvalidOperationException
              && e.getCause().getMessage() != null
              && e.getCause()
              .getMessage()
              .contains(
                      String.format(
                              "new table %s.%s already exists", table.getDbName(), table.getTableName()))) {
        throw new AlreadyExistsException(
                String.format("Table already exists: %s.%s", table.getDbName(), table.getTableName()));
      }
      throw e;
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the HiveTableOperations we are accustomed to handle Hive MetaExceptions, so I think handling InvalidOperationException there would be "natural".

The alterTable wraps the exceptions to a RuntimeException because of the DynMethod calls. I think this is a mistake. We should behave the same way as the metaClients.run(client -> client...) like calls, we should throw the original MetaException - my opinion here could be changes if this becomes a breaking change somewhere, but if it does not break code for someone, then we should change it.

Copy link
Contributor

@pvary pvary Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a mistake in the above command, I thought, that the root exception is MetaException, but in reality it is TException.

So fixing this issue, my comment correctly is below:


The alterTable wraps the exceptions to a RuntimeException because of the DynMethod calls. I think this is a mistake. We should behave the same way as the metaClients.run(client -> client...) like calls, we should throw the original TException - my opinion here could be changes if this becomes a breaking change somewhere, but if it does not break code for someone, then we should change it.

// TException would be wrapped into RuntimeException during reflection
if (e.getCause() instanceof TException) {
throw (TException) e.getCause();
} else {
throw e;
}
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,6 @@
import org.apache.iceberg.DataFile;
import org.apache.iceberg.DataFiles;
import org.apache.iceberg.FileFormat;
import org.apache.iceberg.HasTableOperations;
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.PartitionSpecParser;
import org.apache.iceberg.Schema;
Expand Down Expand Up @@ -82,7 +81,6 @@
import org.apache.iceberg.types.Types;
import org.apache.iceberg.util.JsonUtil;
import org.apache.thrift.TException;
import org.assertj.core.api.Assertions;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
Expand Down Expand Up @@ -1181,55 +1179,4 @@ public void testDatabaseLocationWithSlashInWarehouseDir() {

assertThat(database.getLocationUri()).isEqualTo("s3://bucket/database.db");
}

// TODO: This test should be removed after fix of https://github.com/apache/iceberg/issues/9289.
@Test
@Override
public void testRenameTableDestinationTableAlreadyExists() {
Namespace ns = Namespace.of("newdb");
TableIdentifier renamedTable = TableIdentifier.of(ns, "table_renamed");

if (requiresNamespaceCreate()) {
catalog.createNamespace(ns);
}

Assertions.assertThat(catalog.tableExists(TABLE))
.as("Source table should not exist before create")
.isFalse();

catalog.buildTable(TABLE, SCHEMA).create();
Assertions.assertThat(catalog.tableExists(TABLE))
.as("Source table should exist after create")
.isTrue();

Assertions.assertThat(catalog.tableExists(renamedTable))
.as("Destination table should not exist before create")
.isFalse();

catalog.buildTable(renamedTable, SCHEMA).create();
Assertions.assertThat(catalog.tableExists(renamedTable))
.as("Destination table should exist after create")
.isTrue();

// With fix of issues#9289,it should match with CatalogTests and expect
// AlreadyExistsException.class
// and message should contain as "Table already exists"
Assertions.assertThatThrownBy(() -> catalog.renameTable(TABLE, renamedTable))
.isInstanceOf(RuntimeException.class)
.hasMessageContaining("new table newdb.table_renamed already exists");
Assertions.assertThat(catalog.tableExists(TABLE))
.as("Source table should still exist after failed rename")
.isTrue();
Assertions.assertThat(catalog.tableExists(renamedTable))
.as("Destination table should still exist after failed rename")
.isTrue();

String sourceTableUUID =
((HasTableOperations) catalog.loadTable(TABLE)).operations().current().uuid();
String destinationTableUUID =
((HasTableOperations) catalog.loadTable(renamedTable)).operations().current().uuid();
Assertions.assertThat(sourceTableUUID)
.as("Source and destination table should remain distinct after failed rename")
.isNotEqualTo(destinationTableUUID);
}
}