Error using package in Databricks #3

Open
zakipatel opened this issue Aug 22, 2016 · 3 comments

@zakipatel

I am getting the following error:

'module' object has no attribute 'viewkeys'

I am running Python 2.7.10 and installed the package with pip install spark-df-profiling in Databricks (Spark 2.0).

I am able to import the module, but when I pass a DataFrame to the profiler, I get the above error (see the attached screenshot).
[Screenshot, 2016-08-22 at 2:38:33 PM]

@julioasotodv
Owner

Hi zakipatel:

Yes, you're right. It looks like the problem is that the version of the default six library installed in Databricks is very old:

import six
six.__version__

Still, I don't know whether you can upgrade the version of a Python package in Databricks...
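
A quick way to confirm this diagnosis from a notebook cell is to check whether the installed six actually exposes viewkeys, the attribute named in the error. This is only a minimal sketch: check_attribute is a hypothetical helper, not part of six or spark-df-profiling.

```python
import importlib


def check_attribute(module_name, attribute):
    """Return (version, has_attribute) for a module, or None if it cannot be imported.

    Hypothetical helper for illustration; not part of six or spark-df-profiling.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Not every module defines __version__, so fall back to a placeholder.
    version = getattr(module, "__version__", "unknown")
    return version, hasattr(module, attribute)


# On the affected cluster the second element would be expected to be False.
print(check_attribute("six", "viewkeys"))
```

If the second element of the result is False, the installed six predates viewkeys and needs to be upgraded.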

@julioasotodv
Owner

julioasotodv commented Aug 22, 2016

According to the Databricks docs, you just have to create a new library and enter six==1.10.0 as the PyPI name. Then re-attach your notebook to the cluster (or restart the cluster), and it will work.
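
After re-attaching, it may be worth verifying that the notebook really picked up the pinned version. The sketch below compares version strings numerically, because a plain string comparison would rank "1.9.0" above "1.10.0". The helper names are hypothetical, and 1.10.0 is simply the version pinned above, not a documented minimum.

```python
def version_tuple(version):
    """Turn a dotted version such as '1.10.0' into (1, 10, 0).

    Non-numeric parts (for example 'dev') are ignored in this simplified sketch.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def meets_pin(installed, pinned="1.10.0"):
    """Return True if the installed version is at least the pinned one."""
    return version_tuple(installed) >= version_tuple(pinned)


# Example with literal version strings; in a notebook, pass six.__version__ instead.
print(meets_pin("1.9.0"))
print(meets_pin("1.10.0"))
```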

On the other hand, I am looking into how HTML can be displayed nicely in Databricks notebooks...

This should work for you:

import spark_df_profiling
df = sqlContext.createDataFrame([["2",True,None,"8"],
                                 ["2",False,None,"8"],
                                 ["2",True,"5","7"]], ["a","b","c","d"])
rep = spark_df_profiling.ProfileReport(df)
displayHTML(rep.html)

But since jQuery is not being loaded, it doesn't look good at all. I will try to reach the Databricks guys and come back with a solution...

@zakipatel
Author

Thanks! I created the new library as you instructed, and the error is resolved. displayHTML also works; however, the output is not as nicely formatted, and the details toggle does not work. There is a way to make JavaScript libraries available to displayHTML in Databricks (either by putting the library on the FileStore or by using https in the source link for the jQuery import).

Is there a recommended way to resolve the UI layout in Databricks that I can try?
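
For reference, the https option could be sketched roughly as below. This is untested against Databricks and rests on several assumptions: with_script is a hypothetical helper, the jQuery URL is only an example CDN address, and it is assumed that placing a script tag ahead of the report HTML is enough for the report's own scripts to find jQuery.

```python
def with_script(html, script_src):
    """Prepend a <script> tag that loads script_src over https to an HTML string.

    Hypothetical helper for illustration; not part of spark-df-profiling.
    """
    if not script_src.startswith("https://"):
        raise ValueError("script_src must be an https URL")
    return '<script src="{0}"></script>\n{1}'.format(script_src, html)


# Example CDN URL (an assumption; any https-hosted jQuery build would do).
JQUERY_SRC = "https://code.jquery.com/jquery-3.6.0.min.js"

# Stand-in for rep.html from the snippet earlier in the thread.
report_html = "<div>report body</div>"
print(with_script(report_html, JQUERY_SRC))
```

In a notebook, the result would be passed to displayHTML in place of rep.html.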
