[BUG] ClassNotFoundException for 'excel.DefaultSource' while using API V2 #789
Comments
Having the same problem after installing com.crealytics:spark-excel_2.12:3.4.1_0.20.1 from Maven on an Azure Databricks cluster with runtime version 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12). Confirmed working after switching to com.crealytics:spark-excel_2.12:3.4.1_0.19.0.
@nightscape Could you take a look, please?
Yes, spark-excel_2.12:3.4.1_0.19.0 works. Using:
// .option("useHeader", "true")
Could somebody look into this? I'll only get around to having a look at it in ~1 month, because we're in the last stages of house construction and then moving...
Same here. I was finally trying to update our Spark from 3.3 to 3.4 and stumbled over the same issue. It seems to be related to the change from Spark 3.3 to 3.4, and for me it is not related to the actual spark-excel package version (0.19 and up all fail, even if they work for others). Will look into it...
I get the same ClassNotFoundException when trying to do spark.read.format("com.crealytics.spark.excel") in PySpark with Spark 3.5.0 / Scala 2.12. I guess it is because of this issue. Is there any update on this? It seems like a packaging error. At the moment the package is unusable at the latest version, which is the only Spark 3.5.* build.
Since it is a packaging error, I believe it is a Mill issue. There were some changes in build.sc since 0.19, as well as an update from Mill 0.11.4 to 0.11.5. Unfortunately I am no Mill expert, nor have I gotten the build working in IntelliJ (at least my first tries were pretty unsuccessful). I'll keep trying, but if some Mill expert could help, that would be great.
In the meantime you could try the spark=3.4.1 / spark-excel=0.19.0 build with Spark 3.5, i.e. the 3.4.1_0.19.0 artifact together with spark.read.format("excel"). It could work because there were no DataSource V2 API changes from 3.4 to 3.5...
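A minimal sketch of that workaround, assuming the com.crealytics:spark-excel_2.12:3.4.1_0.19.0 artifact is on the classpath of a Spark 3.5 cluster and /data/report.xlsx is a hypothetical input path:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-excel-workaround").getOrCreate()

// The 3.4.1_0.19.0 artifact was built against Spark 3.4, but since the DataSource V2
// API it uses did not change between 3.4 and 3.5, loading it on 3.5 may still work.
val df = spark.read
  .format("excel")               // short name registered by spark-excel
  .option("header", "true")      // first row contains column names
  .load("/data/report.xlsx")

df.show()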
@christianknoepfle thanks for the advice and the efforts. I temporarily downgraded my cluster to 3.4.1.
The commit introducing the issue seems to be e911d0cf8bd5465f7a3f82289c50045556ba6c91, which is a little bit surprising because it contains only the minimal changes to update Mill. |
The incorrect JAR files issue should be solved in |
Is there an existing issue for this?
Current Behavior
I am using Spark version 3.5.0 with Scala version 2.13.
I am getting java.lang.ClassNotFoundException: excel.DefaultSource for the following line of code:
Dataset<Row> df = spark.read().format("excel").option("header", "true").load(path);
I have also tried the following code but got a similar error - ClassNotFoundException: com.crealytics.spark.excel.DefaultSource:
Dataset<Row> df = spark.read().format("com.crealytics.spark.excel").option("header", true).load(path);
I have inspected the jar file spark-excel_2.13-3.5.0_0.20.1.jar, but it is missing the package com.crealytics.spark.excel.
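For context on the two error messages: Spark resolves spark.read().format(name) by first looking for a DataSourceRegister service entry whose short name matches, and only then falls back to loading the class name itself and then name + ".DefaultSource", which is why the failures show excel.DefaultSource and com.crealytics.spark.excel.DefaultSource. A rough sketch for checking whether the published jar actually contains the expected entries (the local jar path is an assumption):

import java.util.jar.JarFile
import scala.jdk.CollectionConverters._

// Hypothetical local path to the artifact downloaded from Maven Central.
val jar = new JarFile("spark-excel_2.13-3.5.0_0.20.1.jar")
val entries = jar.entries().asScala.map(_.getName).toSet

// Class that format("com.crealytics.spark.excel") falls back to.
println(entries.contains("com/crealytics/spark/excel/DefaultSource.class"))

// Service file Spark's ServiceLoader reads to match short names like "excel".
println(entries.contains("META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"))

jar.close()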
Expected Behavior
No response
Steps To Reproduce
No response
Environment
Anything else?
No response