Kartothek 3.15.1

fjetter released this on 28 Sep at 16:06
21ddca4

Version 3.15.1 (2020-09-28)

Note: Identical to 3.15.0, but with a packaging fix.

New functionality

  • Add kartothek.io.dask.dataframe.store_dataset_from_ddf, which writes
    a dask dataframe as a dataset without update support. Overwriting
    is either forbidden or explicitly allowed by the caller; existing
    datasets are never updated in place.
  • The sort_partitions_by feature now supports multiple columns.
    While this has only a marginal effect on predicate pushdown, it can
    be used to improve Parquet compression.
  • build_cube_from_dataframe now supports the shuffle methods
    offered by kartothek.io.dask.dataframe.store_dataset_from_ddf and
    kartothek.io.dask.dataframe.update_dataset_from_ddf, but writes
    the output in the cube format.
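The compression claim for multi-column sort_partitions_by can be illustrated without Kartothek or Parquet. The sketch below is only an analogy: it serializes each column of a small synthetic table separately (roughly how a columnar format lays out data) and compresses it with zlib from the standard library. The column values, row count, and helper names are assumptions made up for this illustration, and the byte counts say nothing about real Parquet files.

```python
import random
import zlib


def column_bytes(rows, index):
    """Serialize one column as its values back to back, one per line."""
    return "\n".join(row[index] for row in rows).encode("utf-8")


def compressed_size(rows):
    """Sum of the zlib-compressed sizes of all columns, compressed independently."""
    return sum(
        len(zlib.compress(column_bytes(rows, i))) for i in range(len(rows[0]))
    )


# Synthetic two-column table with low-cardinality values (illustrative only).
random.seed(0)
countries = ["DE", "FR", "IT", "ES", "NL"]
products = [f"product-{i:02d}" for i in range(20)]
rows = [(random.choice(countries), random.choice(products)) for _ in range(10_000)]

# Compare: no sorting, sorting by the first column, sorting by both columns.
unsorted_size = compressed_size(rows)
one_column_size = compressed_size(sorted(rows, key=lambda r: r[0]))
two_column_size = compressed_size(sorted(rows, key=lambda r: (r[0], r[1])))

print("unsorted:         ", unsorted_size)
print("sorted by 1 column:", one_column_size)
print("sorted by 2 columns:", two_column_size)
```

Sorting by one column turns only that column into long runs of repeated values, while the second column stays shuffled within each group. Sorting by both columns gives every column long runs, which is why the second sort key can still pay off for compression even when it does little for predicate pushdown.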

Improvements

  • Reduce memory consumption during index write.
  • Allow simplekv stores and storefact URLs to be passed explicitly
    as input for the store arguments.
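An argument that accepts a store object, a store URL, or a factory is commonly handled by normalizing the input once at the API boundary. The following is a minimal sketch of that pattern using only the standard library. It is not Kartothek's implementation: DictStore, store_from_url, and normalize_store are hypothetical names, and the only URL scheme handled here is an assumed "memory://".

```python
from urllib.parse import urlsplit


class DictStore:
    """Hypothetical in-memory key-value store, standing in for a real store object."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value
        return key

    def get(self, key):
        return self._data[key]


def store_from_url(url):
    """Hypothetical URL resolver; this sketch only understands 'memory://'."""
    scheme = urlsplit(url).scheme
    if scheme == "memory":
        return DictStore()
    raise ValueError(f"Unsupported store URL scheme: {scheme!r}")


def normalize_store(store):
    """Accept a URL string, a zero-argument factory, or a store instance."""
    if isinstance(store, str):
        return store_from_url(store)
    if callable(store):
        return store()
    return store


# All three input forms end up as a usable store object.
existing = DictStore()
from_instance = normalize_store(existing)
from_factory = normalize_store(lambda: existing)
from_url = normalize_store("memory://")
from_url.put("key", b"value")
```

Checking for a string first matters in this sketch, because the callable branch would otherwise never see URLs and a plain store instance falls through to the final return unchanged.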