@ctslater and I have been investigating an issue where Spark appears to open more than one connection to the Hive metastore when the metastore is backed by a local Derby database. The Derby database only allows a single connection at a time, so a crash occurs when a second connection is attempted. On epyc we use a shared MySQL database, so this issue went unnoticed across version changes. The bug first appeared on our AWS JupyterHub, where each user has a local Derby database instead of a shared one.
The following is enough to reproduce the bug with Spark 3.0.0; the same code runs without the bug on Spark 2.4.0:
from axs import AxsCatalog, Constants

db = AxsCatalog(spark)
db.import_existing_table(
    "ztf",
    "/epyc/users/stevengs/ztf_oct19_small",
    num_buckets=500,
    zone_height=Constants.ONE_AMIN,
    import_into_spark=True
)
@ctslater, to reproduce this on epyc, navigate to /epyc/users/stevengs/spark-testing and run
source 2.4.0/env.sh
pyspark
and paste in the code above, which should work. Then run
source 3.0.0/env.sh
pyspark
and paste in the code above again, which should fail with a traceback like:
Caused by: ERROR XJ040: Failed to start database '/epyc/users/stevengs/axs/metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@158bc877, see the next exception for details.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
... 115 more
Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /data/epyc/users/stevengs/axs/metastore_db.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.privGetJBMSLockOnDB(Unknown Source)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.getJBMSLockOnDB(Unknown Source)
at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.raw.RawStore$6.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.store.raw.RawStore.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.access.RAMAccessManager$5.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.store.access.RAMAccessManager.bootServiceModule(Unknown Source)
at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
at org.apache.derby.impl.db.BasicDatabase$5.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.db.BasicDatabase.bootServiceModule(Unknown Source)
at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.jdbc.EmbedConnection.startPersistentService(Unknown Source)
... 112 more
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 6, in <module>
File "/epyc/opt/spark-axs-3.0.0-beta/python/axs/catalog.py", line 83, in import_existing_table
self.spark.catalog.createTable(table_name, path, "parquet")
File "/epyc/opt/spark-axs-3.0.0-beta/python/pyspark/sql/catalog.py", line 162, in createTable
df = self._jcatalog.createTable(tableName, source, options)
File "/epyc/opt/spark-axs-3.0.0-beta/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
File "/epyc/opt/spark-axs-3.0.0-beta/python/pyspark/sql/utils.py", line 102, in deco
raise converted
pyspark.sql.utils.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
The env.sh files simply set SPARK_HOME, PATH, and SPARK_CONF_DIR to point to the right version of Spark on epyc and to load special configurations from a hive-site.xml (one for 2.4.0 and one for 3.0.0) that specify using a local Derby database instead of the database running on epyc.
I seem to have gotten it to work by adding --conf spark.sql.hive.metastore.sharedPrefixes=org.apache.derby to the pyspark command line; want to check if that also works for you?
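For anyone who wants to try the same workaround from a script rather than the command line, here is a minimal sketch. The helper names `pyspark_args` and `build_session` are hypothetical (they are not part of AXS or PySpark), and the sketch assumes the property must be supplied before the first SparkSession is created in the process, since it configures how the Hive metastore client classes are loaded.

```python
# Hypothetical helpers (not part of AXS or PySpark) illustrating the workaround.
DERBY_WORKAROUND = {
    "spark.sql.hive.metastore.sharedPrefixes": "org.apache.derby",
}


def pyspark_args(conf):
    """Build a pyspark command line with one --conf flag per setting."""
    args = ["pyspark"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


def build_session(conf):
    """Create a Hive-enabled SparkSession with the given settings applied.

    Assumption: no other SparkSession exists yet in this process.
    """
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark

    builder = SparkSession.builder.enableHiveSupport()
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Show the equivalent command line for the interactive shell.
print(" ".join(pyspark_args(DERBY_WORKAROUND)))
```

With a session from `build_session(DERBY_WORKAROUND)`, the `import_existing_table` call from the reproduction above can be rerun to check whether the Derby error is gone.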
It looks like this fixes the issue for me on both epyc and the cloud system, thanks @ctslater. I guess this should be a default configuration in the spark-defaults.conf that we ship with AXS?
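As a rough sketch of what shipping that default could look like, the helper below appends the property to a spark-defaults.conf file only if the key is not already set. The function name `ensure_default` is hypothetical and the file path is up to the caller; nothing here reflects how AXS actually packages its configuration. One caveat to verify against the Spark documentation: setting this property replaces Spark's built-in list of shared prefixes (which, as far as I know, covers common JDBC drivers), so a deployment that also talks to the shared MySQL metastore may need those prefixes listed too.

```python
import re
from pathlib import Path

KEY = "spark.sql.hive.metastore.sharedPrefixes"
VALUE = "org.apache.derby"


def ensure_default(conf_path, key=KEY, value=VALUE):
    """Append `key value` to a spark-defaults.conf unless the key is already set.

    Returns True if the file was modified, False if the key was already present.
    """
    path = Path(conf_path)
    lines = path.read_text().splitlines() if path.exists() else []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        # Properties may be written as "key value", "key=value" or "key:value".
        if re.split(r"[=:\s]", stripped, maxsplit=1)[0] == key:
            return False  # leave an existing setting untouched
    lines.append(f"{key} {value}")
    path.write_text("\n".join(lines) + "\n")
    return True
```

Calling `ensure_default` on the spark-defaults.conf inside the directory that SPARK_CONF_DIR points to would add the line once and leave the file alone on later runs.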