Releases: JDASoftwareGroup/kartothek

Kartothek v3.19.0

12 Feb 10:10

Version 3.19.0 (2021-02-12)

  • Fix an issue where updates on cubes or updates on datasets using
    dask.dataframe might not update all secondary indices, resulting in
    a corrupt state after the update
  • Expose the compression type and row group chunk size in the Cube
    interface via an optional parameter of type
    kartothek.serialization.ParquetSerializer.
  • Add retries to
    kartothek.serialization._parquet.ParquetSerializer.restore_dataframe.
    IOErrors have been observed on long-running ktk + dask tasks. Until
    the root cause is fixed, the serialization is retried to gain more
    stability.
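The retry behavior described in the last bullet can be sketched with a small wrapper; the helper name, retry count, and the flaky reader below are illustrative assumptions, not kartothek's actual implementation:

```python
import time

def with_retries(func, max_retries=3, backoff_seconds=0.0):
    """Retry ``func`` on IOError, guarding against transient storage
    failures seen on long-running dask tasks (sketch only)."""
    def wrapper(*args, **kwargs):
        for attempt in range(max_retries):
            try:
                return func(*args, **kwargs)
            except IOError:
                if attempt == max_retries - 1:
                    raise  # give up after the final attempt
                time.sleep(backoff_seconds)
    return wrapper

# Example: a flaky reader that fails twice before succeeding.
calls = {"n": 0}

def flaky_restore():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient storage error")
    return "dataframe"

restore = with_retries(flaky_restore, max_retries=3)
```

With three attempts allowed, the two transient failures are absorbed and the third call returns normally.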

Kartothek v3.18.0

25 Jan 10:57
5f48231

Version 3.18.0 (2021-01-25)

  • Add cube.suppress_index_on to switch off the default index
    creation for dimension columns
  • Fix an import issue with the zstd module for
    kartothek.core._zmsgpack.
  • Fix a bug in
    kartothek.io_components.read.dispatch_metapartitions_from_factory
    where dispatch_by=[] would be treated like dispatch_by=None,
    not merging all dataset partitions into a single partition.
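The distinction fixed in the last bullet can be modeled in plain Python: grouping by an empty column list should collapse everything into one partition, whereas None means no grouping at all. The function below is a hypothetical model of those semantics, not kartothek's code:

```python
def dispatch_groups(partitions, dispatch_by):
    """Model of the dispatch semantics: ``None`` keeps one group per
    partition, while an empty list merges everything into one group."""
    if dispatch_by is None:
        return [[p] for p in partitions]  # no merging at all
    if dispatch_by == []:
        return [list(partitions)]         # one single merged partition
    # Otherwise, group partitions by the values of the dispatch columns.
    groups = {}
    for p in partitions:
        key = tuple(p[col] for col in dispatch_by)
        groups.setdefault(key, []).append(p)
    return list(groups.values())

parts = [{"country": "DE"}, {"country": "US"}, {"country": "DE"}]
```

Here `dispatch_groups(parts, None)` yields three groups, `dispatch_groups(parts, [])` yields one, and `dispatch_groups(parts, ["country"])` yields two.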

Kartothek v3.17.3

04 Dec 10:22
3a3fa18

Version 3.17.3 (2020-12-04)

  • Allow pyarrow==2 as a dependency.

Kartothek v3.17.2

01 Dec 14:57
8329362

Version 3.17.2 (2020-12-01)

  • #378 Improve logging information for potential buffer serialization
    errors

Kartothek v3.17.1

24 Nov 15:51
082ace7

Version 3.17.1 (2020-11-24)

Bugfixes

  • Fix GitHub #375 by loosening checks of the supplied store argument

Kartothek v3.17.0

23 Nov 09:23
6bd5047

Version 3.17.0 (2020-11-23)

Improvements

  • Improve performance for "in" predicate literals using long object
    lists as values
  • kartothek.io.eager.commit_dataset now allows modifying the user
    metadata without adding new data.
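A common way to speed up "in" predicates with long literal lists, and the kind of change the first bullet likely refers to (a sketch, not kartothek's actual code), is to convert the list to a set once so each membership test is O(1) instead of a linear scan:

```python
def evaluate_in_predicate(values, literals):
    """Return a boolean mask marking which values appear in ``literals``.
    Building a set once makes each lookup O(1), so the total cost is
    O(len(values) + len(literals)) rather than their product."""
    literal_set = set(literals)
    return [v in literal_set for v in values]

mask = evaluate_in_predicate(["a", "b", "c", "d"], ["b", "d"])
```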

Bugfixes

  • Fix an issue where
    kartothek.io.dask.dataframe.collect_dataset_metadata would return
    improper row group statistics
  • Fix an issue where
    kartothek.io.dask.dataframe.collect_dataset_metadata would execute
    get_parquet_metadata at graph construction time
  • Fix a bug in kartothek.io.eager_cube.remove_partitions where all
    partitions were removed instead of none at all.
  • Fix a bug in
    kartothek.core.dataset.DatasetMetadataBase.get_indices_as_dataframe
    which would raise an IndexError if indices were empty or had not
    been loaded

Kartothek v3.16.0

29 Sep 15:04

Version 3.16.0 (2020-09-29)

New functionality

  • Allow filtering of NaNs using the "==", "!=" and "in" operators
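Because NaN compares unequal to itself, filtering for NaN needs special handling. The sketch below models the behavior described in the bullet (a hypothetical predicate evaluator, not kartothek's implementation): a NaN literal with "==" matches NaN values, "!=" matches everything else, and "in" reduces to a series of "==" checks:

```python
import math

def matches(value, op, literal):
    """Evaluate one predicate, treating NaN literals specially since
    ``float("nan") == float("nan")`` is False in plain Python."""
    literal_is_nan = isinstance(literal, float) and math.isnan(literal)
    value_is_nan = isinstance(value, float) and math.isnan(value)
    if op == "==":
        return value_is_nan if literal_is_nan else value == literal
    if op == "!=":
        return not value_is_nan if literal_is_nan else value != literal
    if op == "in":
        # Membership is "equal to any literal", reusing the NaN-aware "==".
        return any(matches(value, "==", item) for item in literal)
    raise ValueError(f"unsupported operator: {op}")
```

For example, `matches(float("nan"), "==", float("nan"))` is True here, whereas the plain `==` comparison would be False.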

Bugfixes

  • Fix a regression which would not allow the usage of
    non-serializable stores even when using factories

Kartothek 3.15.1

28 Sep 16:06
21ddca4

Version 3.15.1 (2020-09-28)

Note: Identical to 3.15.0 but with a fix in packaging

New functionality

  • Add kartothek.io.dask.dataframe.store_dataset_from_ddf to offer
    write support for a dask dataframe without update support. This
    either forbids overwrites or allows them explicitly, and does not
    update existing datasets.
  • The sort_partitions_by feature now supports multiple columns.
    While this has only a marginal effect on predicate pushdown, it may
    be used to improve the Parquet compression.
  • build_cube_from_dataframe now supports the shuffle methods
    offered by kartothek.io.dask.dataframe.store_dataset_from_ddf and
    kartothek.io.dask.dataframe.update_dataset_from_ddf but writes the
    output in the cube format
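Sorting rows within a partition by several columns before writing lines up runs of equal values, which helps Parquet's run-length and dictionary encodings compress better. A plain-Python model of that idea (the column names are illustrative, and this is not kartothek's code):

```python
def sort_rows(rows, sort_by):
    """Sort rows (dicts) by several columns in priority order, so that
    runs of equal values end up adjacent for better Parquet compression."""
    return sorted(rows, key=lambda row: tuple(row[col] for col in sort_by))

rows = [
    {"country": "US", "city": "NYC"},
    {"country": "DE", "city": "Munich"},
    {"country": "DE", "city": "Berlin"},
]
ordered = sort_rows(rows, sort_by=["country", "city"])
```

After sorting, the two "DE" rows are adjacent and ordered by city, so both columns contain longer runs of repeated or sorted values.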

Improvements

  • Reduce memory consumption during index write.
  • Allow simplekv stores and storefact URLs to be passed explicitly
    as input for the store arguments

Kartothek v3.15.0

28 Sep 14:48

Version 3.15.0 (2020-09-28)

New functionality

  • Add kartothek.io.dask.dataframe.store_dataset_from_ddf to offer
    write support for a dask dataframe without update support. This
    either forbids overwrites or allows them explicitly, and does not
    update existing datasets.
  • The sort_partitions_by feature now supports multiple columns.
    While this has only a marginal effect on predicate pushdown, it may
    be used to improve the Parquet compression.
  • build_cube_from_dataframe now supports the shuffle methods
    offered by kartothek.io.dask.dataframe.store_dataset_from_ddf and
    kartothek.io.dask.dataframe.update_dataset_from_ddf but writes the
    output in the cube format

Improvements

  • Reduce memory consumption during index write.
  • Allow simplekv stores and storefact URLs to be passed explicitly
    as input for the store arguments

Kartothek 3.14.0

27 Aug 09:31
27d31be

Version 3.14.0 (2020-08-27)

New functionality

  • Add hash_dataset functionality

Improvements

  • Expand pandas version pin to include 1.1.X
  • Expand pyarrow version pin to include 1.x
  • Large addition to the documentation for multi-dataset handling
    (Kartothek Cubes)