Kartothek v3.15.0

github-actions released this 28 Sep 14:48

Version 3.15.0 (2020-09-28)

New functionality

  • Add kartothek.io.dask.dataframe.store_dataset_from_ddf to offer
    write support for a dask dataframe without update support. The
    function either forbids or explicitly allows overwrites and never
    updates existing datasets.
  • The sort_partitions_by feature now supports multiple columns.
    While this has only a marginal effect on predicate pushdown, it
    may be used to improve Parquet compression.
  • build_cube_from_dataframe now supports the shuffle methods
    offered by kartothek.io.dask.dataframe.store_dataset_from_ddf and
    kartothek.io.dask.dataframe.update_dataset_from_ddf, but writes
    the output in the cube format.
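
The compression remark on sort_partitions_by can be illustrated without kartothek. The following is a minimal sketch, not kartothek code: it lays out made-up rows column by column and uses zlib as a stand-in for Parquet's compression. The helper compressed_size and the example columns (country, product, day) are hypothetical and exist only for this illustration.

```python
import random
import zlib


def compressed_size(rows, sort_by=None):
    """Return the zlib-compressed size of rows stored in a columnar layout.

    Hypothetical helper: zlib stands in for Parquet compression here.
    """
    if sort_by:
        # Sort by several columns at once, comparable to passing a list
        # of columns to sort_partitions_by.
        rows = sorted(rows, key=lambda r: tuple(r[c] for c in sort_by))
    columns = list(rows[0].keys())
    # Columnar layout: all values of one column are stored contiguously.
    payload = "\n".join(",".join(str(r[c]) for r in rows) for c in columns)
    return len(zlib.compress(payload.encode("utf-8")))


rng = random.Random(0)  # fixed seed so the made-up data is reproducible
rows = [
    {
        "country": rng.choice(["DE", "FR", "US", "JP"]),
        "product": rng.choice(range(20)),
        "day": rng.choice(range(30)),
    }
    for _ in range(5000)
]

unsorted_size = compressed_size(rows)
one_col = compressed_size(rows, ["country"])
two_col = compressed_size(rows, ["country", "product"])

print("unsorted:", unsorted_size)
print("sorted by country:", one_col)
print("sorted by country, product:", two_col)
```

Each additional sort column turns one more column into long runs of repeated values, which a general-purpose compressor encodes cheaply. How much a real dataset gains depends on its cardinalities and on the Parquet encoding in use.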

Improvements

  • Reduce memory consumption during index write.
  • Allow simplekv stores and storefact URLs to be passed explicitly
    as input for the store arguments.