Lazy-import of all popular Python Data Science libraries

MlataIbrahim/pyforest

pyforest - feel the bliss of automatic importing

pyforest lazy-imports all popular Python Data Science libraries so that they are always there when you need them. Once you use a package, pyforest imports it and even adds the import statement to your first Jupyter cell. If you don't use a library, it won't be imported.

Demo in Jupyter Notebook

[Image: demo of pyforest in a Jupyter Notebook]

Using pyforest

After you have installed pyforest and its Jupyter extension, you can use your favorite Python Data Science commands like you normally would - just without writing imports.

For example, if you want to read a CSV with pandas:

df = pd.read_csv("titanic.csv")

pyforest will automatically import pandas for you and add the import statement to the first cell:

import pandas as pd
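
To make the idea of "import on first use" concrete, here is a minimal sketch of a lazy placeholder. It is not pyforest's actual implementation: the class `LazyPlaceholder` and the list `used_imports` are hypothetical names invented for this illustration, and the standard-library module `statistics` stands in for pandas so that the example runs without extra installs.

```python
class LazyPlaceholder:
    """Hypothetical stand-in that runs an import statement on first use."""

    def __init__(self, import_statement, name, registry):
        self._statement = import_statement  # e.g. "import statistics as stats"
        self._name = name                   # name that the statement binds
        self._registry = registry           # list collecting executed statements
        self._target = None                 # the real module, once imported

    def _load(self):
        if self._target is None:
            namespace = {}
            exec(self._statement, namespace)  # the real import happens here
            self._target = namespace[self._name]
            self._registry.append(self._statement)
        return self._target

    def __getattr__(self, attribute):
        # Only called for attributes that are not found on the placeholder.
        if attribute.startswith("_"):
            raise AttributeError(attribute)
        return getattr(self._load(), attribute)


used_imports = []
stats = LazyPlaceholder("import statistics as stats", "stats", used_imports)

statements_before_use = list(used_imports)  # still empty: nothing was used yet
result = stats.mean([1, 2, 3])              # first use triggers the import
```

After the first attribute access, `used_imports` contains the statement that was executed. A tool could write such a list back into a notebook cell, which is roughly the behavior described above.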

Which libraries are available?

  • We aim to add all popular Python Data Science libraries, which should account for more than 99% of your daily imports. For example, we have already added pandas as pd, numpy as np, seaborn as sns, matplotlib.pyplot as plt, OneHotEncoder from sklearn, and many more. In addition, there are helper modules like os, re, tqdm, and Path from pathlib.
  • You can see an overview of all currently available imports here
  • If you are missing an import, you can either add it to your user-specific pyforest imports as described in the FAQs, or open a pull request for the official pyforest imports.

In order to gather all the most important names, we need your help. Please open a pull request and add the imports that we are still missing.

Installation

You need Python 3.6 or above because we love f-strings.

From the terminal (or Anaconda prompt in Windows), enter:

pip install --upgrade pyforest
python -m pyforest install_extensions

And you're ready to go.

Please note that this will also add pyforest to your IPython default startup settings. If you do not want this, you can disable the auto_import as described in the FAQs below.

Frequently Asked Questions

  • "How to add my own import statements without adding them to the package source code?"

    • pyforest creates a file in your home directory at ~/.pyforest/user_imports.py in which you can type any explicit import statements you want (e.g. import pandas as pd). Your own custom imports take precedence over any other pyforest imports. Please note: implicit imports (e.g. from pandas import *) won't work.
  • "Doesn't this slow down my Jupyter or Python startup process?"

    • No, because the libraries will only be imported when you actually use them. Until you use them, the variables like pd are only pyforest placeholders.
  • "Why can't I just use the typical IPython import?"

    • If you were to add all the libraries that pyforest includes to your IPython startup imports, startup might take more than 30 seconds.
  • "I don't have and don't need tensorflow. What will happen when I use pyforest?"

    • TensorFlow is included in the pyforest imports, but pyforest does not install any dependencies, and a library that you never use is never imported. You need to install your libraries separately from pyforest. Afterwards, you can access them via pyforest if they are included in the pyforest imports.
  • "Will the pyforest variables interfere with my own local variables?"

    • No, never. pyforest will never mask or overwrite any of your local variables. You can use your variables like you would without pyforest. The worst thing that can happen is that you overwrite a pyforest placeholder and thus cannot use the placeholder any more (duh).
  • "What about auto-completion on lazily imported modules?"

    • It works :) As soon as you start the auto-completion, pyforest will import the module and return the available symbols to your auto-completer.
  • "How to (temporarily) deactivate the auto_import in IPython and Jupyter?"

    • Go to the directory ~/.ipython/profile_default/startup and adjust or delete the pyforest_autoimport.py file. You will find further instructions in the file. If you don't use the auto_import, you will need to import pyforest at the beginning of your notebook via: import pyforest
  • "How to (re)activate the pyforest auto_import?"

    • Execute the following Python command in Jupyter, IPython or Python: from pyforest.auto_import import setup; setup(). Please note that the auto_import only works for Jupyter and IPython.
  • "Can I use pyforest outside of the Jupyter Notebook or Lab?"

    • Technically, yes. However, this is not the intended use case. pyforest is intended primarily for use in a Jupyter Notebook or Lab. If you want to use pyforest in IPython, a Python script, etc., please import it as follows: import pyforest. Afterwards, you can get the currently active imports via pyforest.active_imports().
  • "Why is the project called pyforest?"

    • pyforest is created to be the home for all Data Science packages - including pandas. And in which ecosystems do pandas live? :)
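
As an illustration of the first FAQ entry above, a user imports file might look like the following sketch. The modules chosen are arbitrary standard-library examples, and the one-statement-per-line layout and the handling of comment lines are assumptions rather than documented behavior.

```python
# Hypothetical contents of ~/.pyforest/user_imports.py
# Assumption: one explicit import statement per line.
import json
import statistics as stats
from pathlib import Path
from collections import Counter

# Wildcard imports are not supported according to the FAQ, so avoid lines like:
# from collections import *
```

Each line names exactly what it binds (json, stats, Path, Counter), which is what makes an import explicit in the sense used above.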

Contributing

In order to gather all the most important names, we need your help. Please open a pull request and add the imports that we are still missing to the pyforest imports. You can also find the guidelines in the pyforest imports file.

Using pyforest as Package Developer

pyforest helps you minimize the (initial) import time of your package, which improves the user experience. If you want your package imports to become lazy, rewrite your imports as follows:

Replace

import pandas as pd

with

from pyforest import LazyImport
pd = LazyImport("import pandas as pd")
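
For comparison, the Python standard library offers a related mechanism, importlib.util.LazyLoader, which postpones executing a module until one of its attributes is first accessed. The sketch below follows the recipe in the importlib documentation and makes no claim about how LazyImport works internally. The helper name `lazy_import` and the choice of the `colorsys` module are for illustration only.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code runs on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]  # already imported, nothing to defer
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # prepares the lazy module, does not run its body yet
    return module


colorsys = lazy_import("colorsys")

# The first attribute access triggers the real import.
hue, lightness, saturation = colorsys.rgb_to_hls(1.0, 0.0, 0.0)
```

Either way, the cost of importing is paid on first use instead of at package import time.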

About

pyforest is developed by 8080 Labs. Our goal is to improve the productivity of Python Data Scientists. If you like the speedup to your workflow, you might also be interested in our other project, bamboolib.

Join our community and grow further

If you

  • like our work or
  • want to become a faster Python Data Scientist or
  • want to discuss the future of the Python Data Science ecosystem or
  • are just interested in mingling with like-minded fellows

then you are invited to join our Slack.
