Releases: JDASoftwareGroup/kartothek

Kartothek v3.19.0

12 Feb 10:10

Version 3.19.0 (2021-02-12)

  • Fix an issue where updates on cubes or updates on datasets using
    dask.dataframe might not update all secondary indices, resulting in
    a corrupt state after the update
  • Expose compression type and row group chunk size in Cube interface
    via optional parameter of type
  • Add retries to serialization. IOErrors on long-running ktk + dask
    tasks have been observed. Until the root cause is fixed, the
    serialization is retried to gain more stability.
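
The retry behaviour described in the last item can be pictured with a small, generic sketch. This is not kartothek's implementation: the decorator name `retry_on_io_error`, its parameters, and the `flaky_restore` function are hypothetical and exist only for illustration.

```python
import time
from functools import wraps


def retry_on_io_error(max_attempts=3, delay_seconds=0.0):
    """Retry the wrapped callable when it raises IOError (hypothetical helper)."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except IOError:
                    # Give up after the last attempt and surface the error.
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay_seconds)

        return wrapper

    return decorator


# Hypothetical flaky reader: fails twice, then succeeds.
calls = {"count": 0}


@retry_on_io_error(max_attempts=3)
def flaky_restore():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("transient read failure")
    return "dataframe"


result = flaky_restore()
```

Retrying only on IOError keeps unrelated failures visible, and re-raising after the final attempt means a persistent problem still fails the task.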

Kartothek v3.18.0

25 Jan 10:57

Version 3.18.0 (2021-01-25)

  • Add cube.suppress_index_on to switch off the default index
    creation for dimension columns
  • Fixed the import issue of zstd module for kartothek.core
  • Fix a bug where dispatch_by=[] would be treated like
    dispatch_by=None, not merging all dataset partitions into a
    single partition.
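
The dispatch_by fix hinges on telling an empty list apart from None. The sketch below illustrates that distinction in plain Python. `group_partitions` is a hypothetical stand-in, not kartothek code, and treating dispatch_by=None as "one group per input partition" is an assumption made for this example.

```python
from collections import defaultdict


def group_partitions(partitions, dispatch_by=None):
    """Group partition dicts (hypothetical stand-in, not kartothek code).

    dispatch_by=None  -> no regrouping, one group per input partition
    dispatch_by=[]    -> everything merged into a single group
    dispatch_by=[...] -> one group per distinct combination of column values
    """
    # An explicit `is None` check is required here: `if not dispatch_by`
    # would also be true for [] and reproduce the kind of bug described above.
    if dispatch_by is None:
        return [[p] for p in partitions]
    groups = defaultdict(list)
    for p in partitions:
        key = tuple(p[col] for col in dispatch_by)  # () when dispatch_by == []
        groups[key].append(p)
    return list(groups.values())


parts = [
    {"country": "DE", "file": "a"},
    {"country": "DE", "file": "b"},
    {"country": "US", "file": "c"},
]

ungrouped = group_partitions(parts)                 # dispatch_by=None
merged = group_partitions(parts, dispatch_by=[])    # single merged group
by_country = group_partitions(parts, dispatch_by=["country"])
```

With an empty list every partition maps to the same empty key, so all partitions end up in one group.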

Kartothek v3.17.3

04 Dec 10:22

Version 3.17.3 (2020-12-04)

  • Allow pyarrow==2 as a dependency.

Kartothek v3.17.2

01 Dec 14:57

Version 3.17.2 (2020-12-01)

  • #378 Improve logging information for potential buffer serialization

Kartothek v3.17.1

24 Nov 15:51

Version 3.17.1 (2020-11-24)

Bugfixes

  • Fix GitHub #375 by loosening checks of the supplied store argument

Kartothek v3.17.0

23 Nov 09:23

Version 3.17.0 (2020-11-23)

Improvements

  • Improve performance for "in" predicate literals using long object
    lists as values
  • The user metadata can now be modified without adding new data.

Bugfixes

  • Fix an issue where improper rowgroup statistics would be returned
  • Fix an issue where get_parquet_metadata would be executed at graph
    construction time
  • Fix a bug where all partitions were removed instead of non at
  • Fix a bug which would raise an IndexError if indices were empty
    or had not been loaded

Kartothek v3.16.0

29 Sep 15:04

Version 3.16.0 (2020-09-29)

New functionality

  • Allow filtering of nans using "==", "!=" and "in" operators

Bugfixes

  • Fix a regression which would not allow the usage of non-serializable
    stores even when using factories
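
NaN filtering needs special handling because NaN never compares equal to itself under ordinary float comparison. The following sketch shows one way NaN-aware "==", "!=" and "in" conditions can be evaluated. It is an illustration, not kartothek's implementation, and the `matches` helper is a hypothetical name.

```python
import math


def _is_nan(x):
    return isinstance(x, float) and math.isnan(x)


def matches(value, op, literal):
    """NaN-aware evaluation of a single (op, literal) condition (hypothetical)."""
    if op == "==":
        # A NaN literal selects NaN values instead of comparing with ==.
        return _is_nan(value) if _is_nan(literal) else value == literal
    if op == "!=":
        return (not _is_nan(value)) if _is_nan(literal) else value != literal
    if op == "in":
        return any(matches(value, "==", item) for item in literal)
    raise ValueError(f"unsupported operator: {op}")


rows = [1.0, math.nan, 3.0]

only_nan = [v for v in rows if matches(v, "==", math.nan)]
no_nan = [v for v in rows if matches(v, "!=", math.nan)]
one_or_nan = [v for v in rows if matches(v, "in", [1.0, math.nan])]
```

Without the NaN branch, `value == literal` would be False for every row when the literal is NaN, so an "==" filter could never select the missing values.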

Kartothek 3.15.1

28 Sep 16:06

Version 3.15.1 (2020-09-28)

Note: Identical to 3.15.0 but with a fix in packaging

New functionality

  • Add a function to offer write support of a dask dataframe without
    update support. This forbids or explicitly allows overwrites and
    does not update existing datasets.
  • The sort_partitions_by feature now supports multiple columns.
    While this has only marginal effect for predicate pushdown, it may
    be used to improve the parquet compression.
  • build_cube_from_dataframe now supports the shuffle methods
    offered by other functions but writes the output in the cube format

Improvements

  • Reduce memory consumption during index write.
  • Allow simplekv stores and storefact URLs to be passed explicitly
    as input for the store
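
The note on sort_partitions_by and parquet compression can be made concrete with a rough, library-free sketch. Parquet's dictionary and run-length encodings benefit from long runs of equal values, and sorting by more than one column lengthens the runs in the secondary column as well. The data and the `count_runs` helper below are hypothetical and only illustrate the effect. They do not call kartothek.

```python
import random


def count_runs(values):
    """Number of maximal runs of equal consecutive values."""
    runs = 0
    previous = object()  # sentinel that never equals a real value
    for v in values:
        if v != previous:
            runs += 1
            previous = v
    return runs


random.seed(0)
rows = [(random.choice("AB"), random.choice("xyz")) for _ in range(1000)]

# Sort by the first column only, then by both columns.
sorted_by_one = sorted(rows, key=lambda r: r[0])
sorted_by_two = sorted(rows)

runs_one = count_runs([r[1] for r in sorted_by_one])
runs_two = count_runs([r[1] for r in sorted_by_two])
```

After sorting by both columns, the second column contains at most one run per distinct pair of values (six here), whereas sorting by the first column alone leaves the second column essentially shuffled within each group.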

Kartothek v3.15.0

28 Sep 14:48

Version 3.15.0 (2020-09-28)

New functionality

  • Add a function to offer write support of a dask dataframe without
    update support. This forbids or explicitly allows overwrites and
    does not update existing datasets.
  • The sort_partitions_by feature now supports multiple columns.
    While this has only marginal effect for predicate pushdown, it may
    be used to improve the parquet compression.
  • build_cube_from_dataframe now supports the shuffle methods
    offered by other functions but writes the output in the cube format

Improvements

  • Reduce memory consumption during index write.
  • Allow simplekv stores and storefact URLs to be passed explicitly
    as input for the store

Kartothek 3.14.0

27 Aug 09:31

Version 3.14.0 (2020-08-27)

New functionality

  • Add hash_dataset functionality

Improvements

  • Expand pandas version pin to include 1.1.X
  • Expand pyarrow version pin to include 1.x
  • Large addition to documentation for multi dataset handling
    (Kartothek Cubes)