Add `kartothek.io.dask.dataframe.store_dataset_from_ddf` to offer write
support for a Dask DataFrame without update support. Overwrites are
either forbidden or must be explicitly allowed, and existing datasets
are not updated.
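A minimal usage sketch, not taken from the kartothek docs: it assumes a
store factory built with `storefact`, that `store_dataset_from_ddf`
accepts `dataset_uuid` and `overwrite` keyword arguments, and that the
return value is a Dask collection executed via `compute()`.

```python
from functools import partial

import dask.dataframe as dd
import pandas as pd
from storefact import get_store_from_url

from kartothek.io.dask.dataframe import store_dataset_from_ddf

# Example data as a Dask DataFrame.
ddf = dd.from_pandas(
    pd.DataFrame({"country": ["DE", "US"], "value": [1, 2]}),
    npartitions=2,
)

# Store factory pointing at a local filesystem store (hypothetical path).
store_factory = partial(get_store_from_url, "hfs:///tmp/kartothek_data")

# Write the DataFrame as a new dataset; no update of an existing dataset
# takes place, and overwrites must be requested explicitly.
graph = store_dataset_from_ddf(
    ddf,
    store=store_factory,
    dataset_uuid="demo_dataset",
    overwrite=False,  # set to True to explicitly allow replacing the dataset
)
graph.compute()
```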
The `sort_partitions_by` feature now supports multiple columns. While
this has only a marginal effect on predicate pushdown, it can be used
to improve Parquet compression.
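For illustration, a sketch that passes multiple sort columns; it assumes
`sort_partitions_by` is accepted as a keyword argument by the write
pipelines such as `store_dataset_from_ddf`, and reuses the hypothetical
`ddf` and `store_factory` from the sketch above.

```python
# Rows within each written Parquet file are sorted by these columns.
# Sorted data typically compresses better; the gain for predicate
# pushdown is marginal, as noted above.
graph = store_dataset_from_ddf(
    ddf,
    store=store_factory,
    dataset_uuid="demo_dataset_sorted",
    sort_partitions_by=["country", "value"],
)
graph.compute()
```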
`build_cube_from_dataframe` now supports the shuffle methods offered by
`kartothek.io.dask.dataframe.store_dataset_from_ddf` and
`kartothek.io.dask.dataframe.update_dataset_from_ddf`, but writes the
output in the cube format.
Improvements
Reduce memory consumption during index write.
Allow `simplekv` stores and `storefact` URLs to be passed explicitly as
input for the `store` arguments.
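To illustrate, a sketch of the forms the `store` argument can now take;
the eager `read_table` call, the dataset name, and the exact set of
accepted forms are assumptions based on the description above.

```python
from functools import partial

from storefact import get_store_from_url

from kartothek.io.eager import read_table

url = "hfs:///tmp/kartothek_data"  # hypothetical storefact URL

# Previously the usual form: a factory callable returning a simplekv store.
store_factory = partial(get_store_from_url, url)

# Now also accepted explicitly: an already-instantiated simplekv store ...
store_instance = get_store_from_url(url)
df = read_table(dataset_uuid="demo_dataset", store=store_instance)

# ... or the storefact URL itself.
df = read_table(dataset_uuid="demo_dataset", store=url)
```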