Error running on Spark 1.5.2 (on Databricks) #4

Open
zakipatel opened this issue Sep 1, 2016 · 2 comments

@zakipatel

On Spark 2.0, I am able to successfully install the Python library (%sh pip install spark-df-profiling), run the import (import spark_df_profiling), and create the report (report = spark_df_profiling.ProfileReport(df)).

However, this does not work on Spark 1.5.2. The documentation on GitHub says that Spark 1.5+ is supported and compatible with 2.0. Is there something I need to do in addition to the pip install and setting up the six==1.10.0 library?

See the screenshot below: this works just fine in Spark 2.0 on Databricks, but I get the error on 1.5.2.

[Screenshot: screen shot 2016-09-01 at 4 05 59 pm]
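For reference, here is the full sequence I am running, roughly as it looks across notebook cells (df is an existing Spark DataFrame in my environment):

```python
# Cell 1 (Databricks %sh cell): install the library on the driver
# %sh pip install spark-df-profiling

# Cell 2 (Python cell): import and build the profile
import spark_df_profiling

report = spark_df_profiling.ProfileReport(df)  # df: an existing Spark DataFrame
```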

@julioasotodv (Owner)

Hi,

This may happen if you don't "reload" (detach and reattach) the notebook after installing the library.

Moreover, the recommended way to install Python libraries in Databricks is to attach them to the cluster as libraries rather than installing with pip via the %sh command.

Let me know if that helps.
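A quick way to check whether the reload took effect is a minimal sketch like this (generic Python, not specific to this library):

```python
import importlib

# If this raises ImportError, the running Python process has not picked up
# the install yet (e.g. the notebook was not detached and reattached).
mod = importlib.import_module("spark_df_profiling")
print(mod.__file__)  # shows where the package was loaded from
```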

@SaurabhCK

Use dbutils:

dbutils.library.installPyPI("spark-df-profiling", "1.1.2")

This should work.
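End to end, the idea is roughly this (dbutils comes predefined in Databricks notebooks; installPyPI needs a newer Databricks Runtime than the 2016 one in the original report; df is an assumed existing Spark DataFrame):

```python
# Install a pinned version as a notebook-scoped library
dbutils.library.installPyPI("spark-df-profiling", "1.1.2")

# Then import and profile as before
import spark_df_profiling

report = spark_df_profiling.ProfileReport(df)  # df: an assumed existing Spark DataFrame
```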
