[PERF] Lazily import heavy modules to speed up import times (#2826)
Introduce lazy imports for heavy modules that are not needed as top-level imports. For example, `ray` does not need to be a top-level import: it should only be imported when using the ray runner or when specific ray data extension types are needed. Another example is `UnityCatalogTable`, which is a relatively heavy import despite only being needed when using delta lake.

Modules to import lazily were chosen by their share of total import time, as shown by `importtime-output-wrapper -c 'import daft' --format waterfall --depth 25`.

The newly lazily imported modules are:
- `daft.unity_catalog`
- `fsspec`
- `numpy`
- `pandas`
- `PIL.Image`
- `pyarrow`
- `pyarrow.csv`
- `pyarrow.dataset`
- `pyarrow.fs`
- `pyarrow.json`
- `pyarrow.parquet`
- `ray`
- `ray.data.extensions`
- `xml.etree.ElementTree`

Uses #2836 in order to defer the import of `pyarrow`.

Additionally, we move all type-checking-only module imports into type checking blocks.

With these changes, import times go from roughly 0.6-0.7s to ~0.045s (~13-15x faster).

---------

Co-authored-by: Sammy Sidhu <[email protected]>
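A rough way to reproduce this kind of measurement without a third-party wrapper is CPython's built-in `-X importtime` flag. The sketch below is an illustration, not part of this commit: the helper names `import_times` and `slowest` are hypothetical, and `json` is used as a stand-in module so the snippet runs without daft installed.

```python
import subprocess
import sys


def import_times(module: str) -> dict[str, int]:
    """Return {imported module: cumulative import time in microseconds}."""
    # Run in a fresh interpreter so already-cached modules do not skew results.
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    times: dict[str, int] = {}
    # -X importtime writes rows to stderr in the form
    # "import time: <self us> | <cumulative us> | <module name>".
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3 or not fields[1].strip().isdigit():
            continue  # skip the header row
        times[fields[2].strip()] = int(fields[1])
    return times


def slowest(module: str, n: int = 10) -> list[tuple[str, int]]:
    """Return the n imports with the largest cumulative time."""
    ranked = sorted(import_times(module).items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # "json" is a stand-in; swap in "daft" where it is installed.
    for name, micros in slowest("json"):
        print(f"{micros:>10} us  {name}")
```

Sorting by cumulative time surfaces the imports that dominate startup, which is the same signal the commit message describes using to pick candidates for lazy loading.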
1 parent dba931f, commit 78a92a2
Showing 52 changed files with 401 additions and 311 deletions.
@@ -0,0 +1,13 @@
extend = "../.ruff.toml"

[lint]
extend-select = [
    "TID253", # banned-module-level-imports, derived from flake8-tidy-imports
    "TCH" # flake8-type-checking
]

[lint.flake8-tidy-imports]
# Ban certain modules from being imported at module level, instead requiring
# that they're imported lazily (e.g., within a function definition,
# with daft.lazy_import.LazyImport, or with TYPE_CHECKING).
banned-module-level-imports = ["daft.unity_catalog", "fsspec", "numpy", "pandas", "PIL", "pyarrow", "ray", "xml"]
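As a rough illustration of the patterns the config comment above names, the sketch below keeps `numpy` out of module scope in two ways: a `TYPE_CHECKING`-only import for annotations and a function-level import for runtime use. The function names `describe` and `column_mean` are hypothetical and do not come from this commit.

```python
from __future__ import annotations  # annotations stay as strings at runtime

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers; never executed at runtime,
    # so it adds nothing to import time.
    import numpy as np


def describe(arr: np.ndarray) -> str:
    # The annotation refers to numpy, but nothing here imports it at runtime.
    return f"array with shape {arr.shape}"


def column_mean(values: list[float]) -> float:
    # Function-level import: numpy is loaded on the first call,
    # not when this module is imported.
    import numpy as np

    return float(np.mean(values))
```

A plain `import numpy as np` at the top of the file is what the rule is meant to reject. With the patterns above, importing the module stays cheap and the cost of loading `numpy` is paid only by callers of `column_mean`.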
@@ -0,0 +1,32 @@
from typing import TYPE_CHECKING

from daft.lazy_import import LazyImport

if TYPE_CHECKING:
    import xml.etree.ElementTree as ET

    import fsspec
    import numpy as np
    import pandas as pd
    import PIL.Image as pil_image
    import pyarrow as pa
    import pyarrow.csv as pacsv
    import pyarrow.dataset as pads
    import pyarrow.fs as pafs
    import pyarrow.json as pajson
    import pyarrow.parquet as pq
else:
    ET = LazyImport("xml.etree.ElementTree")

    fsspec = LazyImport("fsspec")
    np = LazyImport("numpy")
    pd = LazyImport("pandas")
    pil_image = LazyImport("PIL.Image")
    pa = LazyImport("pyarrow")
    pacsv = LazyImport("pyarrow.csv")
    pads = LazyImport("pyarrow.dataset")
    pafs = LazyImport("pyarrow.fs")
    pajson = LazyImport("pyarrow.json")
    pq = LazyImport("pyarrow.parquet")

unity_catalog = LazyImport("daft.unity_catalog")
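The implementation of `daft.lazy_import.LazyImport` is not shown in this view. As a minimal sketch of the general technique, assuming a proxy that defers `importlib.import_module` until the first attribute access, something like the following would behave similarly for simple cases. The class name `LazyModule` and its internals are hypothetical and are not the library's actual code.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


class LazyModule:
    """Hypothetical stand-in for a lazy import proxy (not daft's LazyImport)."""

    def __init__(self, module_name: str) -> None:
        self._module_name = module_name
        self._module: Optional[ModuleType] = None

    def _load(self) -> ModuleType:
        # Import on first use, then reuse the cached module object.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return self._module

    def __getattr__(self, name: str) -> Any:
        # Python calls __getattr__ only when normal lookup fails, so
        # _module_name and _module (set in __init__) never reach this path.
        return getattr(self._load(), name)


# Creating the proxy does not import anything yet.
ET = LazyModule("xml.etree.ElementTree")

# The first attribute access triggers the real import.
root = ET.fromstring("<a><b/></a>")
```

Pairing such a proxy with an `if TYPE_CHECKING:` branch, as the file above does, lets type checkers see the real module while the runtime sees only the lightweight proxy.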