Kartothek 3.15.1
Version 3.15.1 (2020-09-28)
Note: Identical to 3.15.0 but with a fix in packaging
New functionality
- Add `kartothek.io.dask.dataframe.store_dataset_from_ddf` to offer
  write support for a dask dataframe without update support. It either
  forbids or explicitly allows overwrites and does not update existing
  datasets.
- The `sort_partitions_by` feature now supports multiple columns.
  While this has only a marginal effect on predicate pushdown, it may
  be used to improve the parquet compression.
- `build_cube_from_dataframe` now supports the `shuffle` methods
  offered by `kartothek.io.dask.dataframe.store_dataset_from_ddf` and
  `kartothek.io.dask.dataframe.update_dataset_from_ddf` but writes the
  output in the cube format.
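
The compression benefit of sorting by several columns, mentioned for
`sort_partitions_by` above, can be illustrated with a small
standard-library sketch. It does not use kartothek or parquet: each
column is serialized contiguously and compressed with `zlib` as a rough
stand-in for a columnar file format, and the column names and data are
made up for the illustration.

```python
import random
import zlib


def column_bytes(rows, index):
    """Serialize one column contiguously, mimicking a columnar layout."""
    return "\n".join(str(row[index]) for row in rows).encode("utf-8")


def compressed_size(rows):
    """Sum of the zlib-compressed sizes of all columns, compressed one by one."""
    n_columns = len(rows[0])
    return sum(len(zlib.compress(column_bytes(rows, i))) for i in range(n_columns))


# Hypothetical low-cardinality data: (region, store) pairs in random order.
rng = random.Random(0)
rows = [
    (f"region_{rng.randrange(5)}", f"store_{rng.randrange(20)}")
    for _ in range(10_000)
]

unsorted_size = compressed_size(rows)
# Sorting by one column creates long runs in that column only.
single_size = compressed_size(sorted(rows, key=lambda r: r[0]))
# Sorting by both columns creates long runs in both columns.
multi_size = compressed_size(sorted(rows, key=lambda r: (r[0], r[1])))

print(unsorted_size, single_size, multi_size)
```

Sorting by both columns produces long runs of repeated values in each
column, which is what a general-purpose compressor exploits. Actual
savings with parquet depend on the encoding and compression codec in
use.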
Improvements
- Reduce memory consumption during index write.
- Allow `simplekv` stores and `storefact` URLs to be passed explicitly
  as input for the `store` arguments.