Releases · NVIDIA-Merlin/core

29 Aug 16:27

v23.08.00

b2af60d

v23.08.00 Latest

Merge `InferenceNode` functionality from Systems into the base `Node`…

31 May 14:39

github-actions

v23.05.00

6fc7877

v23.05.00

What’s Changed

⚠ Breaking Changes

Adjust the DaskExecutor API methods to take Datasets instead of ddfs @karlhigley (#299)

🐜 Bug Fixes

Add some additional mutually exclusive tags to the collisions list @karlhigley (#316)
Fix Pandas extension dtype mapping for newer versions of Pandas @karlhigley (#314)
Provide better string alias support for dtypes, allow external types to resolve to unknown, fix cuDF struct support @karlhigley (#313)
Make ColumnSelector.all a property instead of a manually set attribute @karlhigley (#296)
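
For context on the `ColumnSelector.all` fix (#296): deriving a flag through a property keeps it in sync with the object's state, whereas a manually set attribute can go stale. The sketch below only illustrates that general pattern. `Selector`, its `names` list, and the `"*"` wildcard convention are hypothetical and are not taken from the Merlin source.

```python
class Selector:
    """Hypothetical stand-in for a column selector; not the Merlin class."""

    def __init__(self, names=None):
        self.names = list(names or [])

    @property
    def all(self):
        # Computed from the current names on every access, so it cannot
        # drift out of date the way a manually assigned flag could.
        # Assumption: "*" is the wildcard that means "every column".
        return self.names == ["*"]


wildcard = Selector(["*"])
explicit = Selector(["user_id", "item_id"])
```

Because `all` has no setter here, callers cannot overwrite it by accident; they change `names` and the flag follows.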

🚀 Features

Add Rename to the set of core DAG ops that work in all DAGs @karlhigley (#312)
Enable Compound Tag Selection and Removal to work with atomic tags and strings @oliverholworthy (#317)
Add optional schema parameter to from_df method on TensorTable @oliverholworthy (#286)
Add as_tensor_type method to TensorTable for framework column conversion @oliverholworthy (#285)
Add support for cuDF's struct dtype @karlhigley (#309)

📄 Documentation

Skip errors if branch tracking fails in docs-sched-rebuild @oliverholworthy (#327)
Pin numpy version for docs build to ensure we can build the API docs for recent versions @oliverholworthy (#326)
Create stable branch locally in docs-sched-rebuild to enable stable docs build @oliverholworthy (#325)
Build docs for stable branch and make default @oliverholworthy (#322)

🔧 Maintenance

remove cupy-cuda11x from tox test environment @nv-alaiacano (#323)
Handle schema inference in Dataset with empty list col @oliverholworthy (#319)
Convert data formats before executing each op in LocalExecutor @karlhigley (#280)
Add problem matcher for actionlint to annotate errors @oliverholworthy (#315)
Add actionlint to pre-commit-config to check for valid GitHub Workflow config @oliverholworthy (#290)
Remove optional dependencies from Conda Recipe @oliverholworthy (#298)
Remove warning about compound tags deprecation @oliverholworthy (#256)
Add workflows to check base branch and set stable branch @oliverholworthy (#310)
Update tag pattern in GitHub Workflows @oliverholworthy (#311)
Skip package release jobs for dev tags @oliverholworthy (#305)
Remove use of deprecated numpy aliases of builtin types @oliverholworthy (#308)
don't re-run tests on closed PR @nv-alaiacano (#307)
Revert "Adjust the DaskExecutor API methods to take Datasets inst… @karlhigley (#306)
Update packages workflow, separating PyPI from conda build @oliverholworthy (#300)
Move build-docs to separate job in packages workflow with Python 3.9 @oliverholworthy (#302)
Add Workflow to update the stable branch ref to the latest tag @oliverholworthy (#303)
Adjust the DaskExecutor API methods to take Datasets instead of ddfs @karlhigley (#299)
CI: add quotes to workflow name @nv-alaiacano (#295)

Contributors

karlhigley, oliverholworthy, and nv-alaiacano

26 Apr 16:42

github-actions

v23.04.00

9d9b5c6

v23.04.00

What’s Changed

⚠️ Breaking Changes

Preserve original Dask partitions by default in Dataset.to_parquet @rjzamora (#254)
Change the location and filename of schema.pbtxt to .merlin/schema.json @edknv (#249)

🐜 Bug Fixes

Return a dataframe type that matches reader passed to fetch_table_data @oliverholworthy (#287)
add hack to handle tf not recognizing bool dtype in dlpack @jperez999 (#276)
update numpy version to handle dlpack @jperez999 (#275)
fix cuda import logic from numba and device memsize @jperez999 (#274)
change cpu conversion for tf to convert-to-tensor @jperez999 (#271)
fix gpu numpy conversion offsets @jperez999 (#269)
Disable strict dtype checking by default @karlhigley (#268)
Propagate _unsafe flag through column constructors properly @karlhigley (#264)
Propagate the _unsafe mode flag from TensorTable to TensorColumn @karlhigley (#260)
add import pytest to file @jperez999 (#229)

🚀 Features

Add column_type property to TensorTable @karlhigley (#283)
Extend mapping of nullable types for pandas @oliverholworthy (#278)
add 3d tensor support to creating tensor columns @jperez999 (#246)
Run with import without gpu @jperez999 (#261)
Check environment supports target device in Dataset constructor @oliverholworthy (#243)
Support Dataset cpu-mode in environment with GPUs that have not been detected @oliverholworthy (#236)
Allow casting a Dimension to an integer when min and max are the same @karlhigley (#252)
Add predicate function argument to select_by_tag @oliverholworthy (#94)
Add row_group_size argument to Dataset.to_parquet @rjzamora (#218)
Enable Schema selection using select_by_tag with string representation of Tags enum. @oliverholworthy (#242)
Add Schema copy method @oliverholworthy (#240)
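
The "casting a Dimension to an integer" entry (#252) describes a conversion that is only well defined when the bounds coincide. A minimal sketch of that idea follows, assuming a simple bounds pair. The `Dim` class and its fields are hypothetical illustrations, not Merlin's `Dimension` implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dim:
    """Hypothetical dimension with inclusive bounds; max=None means unbounded."""

    min: int = 0
    max: Optional[int] = None

    def __int__(self):
        # A single integer only makes sense for a fixed-size dimension,
        # i.e. when both bounds are known and equal.
        if self.max is None or self.min != self.max:
            raise ValueError("only a fixed dimension (min == max) can be cast to int")
        return self.min


fixed = Dim(3, 3)
ragged = Dim(1, 5)
```

With this sketch, `int(fixed)` yields the shared bound, while `int(ragged)` raises `ValueError` instead of silently picking one of the bounds.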

🔧 Maintenance

Update pull_apart_list to use pd.concat instead of deprecated Series.append @oliverholworthy (#291)
Install protobuf version compatible with tensorflow 2.9 for Merlin Models tests @oliverholworthy (#289)
Add support for from_dlpack with numpy 1.23.0 @oliverholworthy (#284)
Save schema in old location for backwards compatibility @oliverholworthy (#267)
Refactor LocalExecutor into more discrete steps that can be overridden @karlhigley (#279)
Preserve type of shape dims as ints when re-loading schema from disk @oliverholworthy (#281)
uses compat everywhere to allow container bypass when gpus not present @jperez999 (#277)
update numpy version to handle dlpack @jperez999 (#275)
fix cuda import logic from numba and device memsize @jperez999 (#274)
migrate compat into a separate folder and separate tf and torch import @jperez999 (#272)
change cpu conversion for tf to convert-to-tensor @jperez999 (#271)
compat imports update @jperez999 (#270)
fix gpu numpy conversion offsets @jperez999 (#269)
fix configure tf function to id all gpus available @jperez999 (#266)
migrate configure tensorflow to core, separate has_gpu from compat @jperez999 (#265)
add 3d tensor support to creating tensor columns @jperez999 (#246)
Revert #261 and #262 (merlin.core.compat changes) @karlhigley (#263)
Run with import without gpu @jperez999 (#261)
Update merlin.core.compat to use HAS_GPU and add add'l libraries @karlhigley (#262)
Rework DLpack conversion dispatching to allow caching dispatched methods @karlhigley (#259)
Add an unsafe mode to TensorTable/TensorColumn (for internal use) @karlhigley (#258)
Make TensorColumn shape and dtype properties lazy but memoized @karlhigley (#257)
Bump dask, distributed, fsspec versions @karlhigley (#201)
Move common steps to run tox env into reusable workflow @oliverholworthy (#247)
Improve check for array types in is_list_dtype @oliverholworthy (#253)
Support cupy and numpy array types in flatten_list_column_values @oliverholworthy (#251)
Update is_list_dtype to handle additional types @oliverholworthy (#250)
Remove use of HAS_GPU from dispatch functions @oliverholworthy (#244)
Change the location and filename of schema.pbtxt to .merlin/schema.json @edknv (#249)
Add workflow for testing dataloader @oliverholworthy (#186)
add import pytest to file @jperez999 (#229)
Add correct job dependency for release in cpu-packages @oliverholworthy (#241)
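
The entry "Make TensorColumn shape and dtype properties lazy but memoized" (#257) names a common pattern: compute a value on first access, then reuse it. One way to express that in plain Python is `functools.cached_property`. The sketch below is an assumption about the pattern only; the `Column` class and its counter are hypothetical and do not reflect how Merlin implements it.

```python
from functools import cached_property


class Column:
    """Hypothetical column wrapper; not Merlin's TensorColumn."""

    def __init__(self, values):
        self.values = values
        self.shape_computations = 0  # counts how often shape is actually computed

    @cached_property
    def shape(self):
        # Runs only on first access; the result is then stored on the instance
        # and returned directly on later accesses.
        self.shape_computations += 1
        return (len(self.values),)


col = Column([1, 2, 3])
```

Nothing is computed at construction time, and repeated reads of `col.shape` trigger the computation once.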

Contributors

karlhigley, oliverholworthy, and 3 other contributors

10 Mar 09:50

github-actions

v23.02.01

6d1e720

v23.02.01

What's Changed

Patch release on top of v23.02.00

🔧 Maintenance

Add pynvml dependency @oliverholworthy (#237)

Full Changelog: v23.02.00...v23.02.01

Contributors

oliverholworthy

08 Mar 16:32

github-actions

v23.02.00

a824ab7

v23.02.00

What’s Changed

⚠ Breaking Changes

Remove use of is_list/is_ragged and replace with setting shapes @karlhigley (#215)
Add a new shape field to ColumnSchema @karlhigley (#195)

🐜 Bug Fixes

Save schema with consistent dtype when dtypes is used @oliverholworthy (#182)

🚀 Features

Update HAS_GPU variable to account for CUDA_VISIBLE_DEVICES @oliverholworthy (#221)
Clean up of make_df function @jperez999 (#205)
separate cupy import from rapids @jperez999 (#211)
Support partially specified value_count when used with is_ragged=False @oliverholworthy (#213)
Fix for updated versions of cudf to parquet @jperez999 (#204)
Create standard Merlin dtypes in the merlin.dtypes module @karlhigley (#170)

🔧 Maintenance

Remove use of is_list/is_ragged and replace with setting shapes @karlhigley (#215)
Reduce the overhead of using LocalExecutor (esp. dtype validation) @karlhigley (#219)
Clean up of make_df function @jperez999 (#205)
Add util functions for un/grouping column values/offsets in dicts @karlhigley (#216)
Fill in some missing docstrings @karlhigley (#217)
Serialize shapes to and from Merlin schema files @karlhigley (#214)
Fix for updated versions of cudf to parquet @jperez999 (#204)
add gcp label to jenkinsfile @AyodeAwe (#181)
Add a new shape field to ColumnSchema @karlhigley (#195)
Increase upper bound of pandas version from 1.4 to 1.6 @oliverholworthy (#210)
Update pre-commit config with latest versions of repos @oliverholworthy (#208)
Install latest version of NVTabular/dataloader with systems tests @oliverholworthy (#209)
Add note on why we're using device_get_count instead of cuda.gpus @oliverholworthy (#207)
Add Formatter (Prettier) for YAML and Markdown files @karlhigley (#199)
Change the name of the package building action @karlhigley (#198)
Split CPU tests and building packages for release into separate actions @karlhigley (#197)
Simplify ColumnSchema.with methods using dataclasses.replace() @karlhigley (#194)
Handle executor transform case when parent node provides no new columns @oliverholworthy (#226)
Update Models/NVTabular test config @oliverholworthy (#185)
skip notebook tests in models test @edknv (#193)
add a build pandas column api for easier multihot column creation @jperez999 (#183)
Use pre-commit for linting in GitHub Actions Workflow @oliverholworthy (#184)
Convert to cudf.Series in create_multihot_col @oliverholworthy (#187)
adding workflow for GPU CI on gha @jperez999 (#191)
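
The entry "Simplify ColumnSchema.with methods using dataclasses.replace()" (#194) refers to the standard-library helper `dataclasses.replace`, which copies a dataclass instance with selected fields changed. A minimal sketch of how `with_*` style methods can lean on it is shown below. The `Column` class and its fields are hypothetical and much smaller than the real `ColumnSchema`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Column:
    """Hypothetical immutable column description; not Merlin's ColumnSchema."""

    name: str
    dtype: str = "unknown"
    tags: tuple = ()

    def with_name(self, name):
        # replace() builds a new instance, copying every field not overridden.
        return replace(self, name=name)

    def with_dtype(self, dtype):
        return replace(self, dtype=dtype)


original = Column("user_id", tags=("categorical",))
updated = original.with_dtype("int64")
```

Each `with_*` method stays a one-liner, and new fields added to the dataclass are carried over automatically without touching those methods.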

Contributors

karlhigley, oliverholworthy, and 3 other contributors

30 Dec 19:41

github-actions

v0.10.0

2fc6889

v0.10.0 (22.12)

What’s Changed

🐜 Bug Fixes

Fix file-count warning in Dataset.to_parquet @rjzamora (#159)
Remove the @property annotation from Transformable.columns @karlhigley (#166)
Update value_count serialization/deserialization to be consistent with original schema @oliverholworthy (#111)
Fix feature.shape attribute in from_merlin_schema @rjzamora (#169)
Add the schema to the output of the .repartition() method @sararb (#192)

🚀 Features

Read parquet statistics to optimize len when they are missing @rjzamora (#178)
Change is_ragged property based on value_count in with_properties @oliverholworthy (#172)
add is_list detection for merlin columns @jperez999 (#180)
Enable partial value count to be specified @oliverholworthy (#171)

📄 Documentation

docs: Add temp semver to calver banner @mikemckiernan (#161)

🔧 Maintenance

Remove specifying is_ragged in LocalExecutor _transform_data @oliverholworthy (#173)
Add Jenkinsfile @AyodeAwe (#167)
Fix concat_columns for DataFrames with list features @oliverholworthy (#165)
update drafter to work on tags & update cpu ci to target branches @jperez999 (#174)
Remove explicit DictArray reference from merlin.core.dispatch @karlhigley (#163)

Contributors

karlhigley, oliverholworthy, and 6 other contributors

22 Nov 19:33

github-actions

v0.9.0

64755ba

v0.9.0

What’s Changed

🐜 Bug Fixes

Update with_properties to enable changing existing properties on ColumnSchema @oliverholworthy (#157)
Patch is_list_dtype/list_val_dtype to work with Numpy ndarrays @karlhigley (#153)
Fix dtype inference from pandas list column @rjzamora (#154)

🚀 Features

necessary changes to allow graph execution in dataloader @jperez999 (#152)

📄 Documentation

docs: Add basic SEO configuration @mikemckiernan (#160)

🔧 Maintenance

Rework executor transform methods to accept a Graph @karlhigley (#158)
Update with_properties to enable changing existing properties on ColumnSchema @oliverholworthy (#157)
Serialize/Deserialize ColumnSchema consistently when the domain name matches the feature name @oliverholworthy (#155)

Contributors

karlhigley, oliverholworthy, and 3 other contributors

24 Oct 18:05

github-actions

v0.8.0

14a18dc

v0.8.0

What’s Changed

🚀 Features

Add wildcard selector for cases where you'd like to select all columns by @karlhigley in #143

🐜 Bug Fixes

Avoid using numba to set device context in import by @rjzamora in #145
Fix ambiguous statement when names is a list by @jperez999 in #147
Resolve wildcard selectors in BaseOperator.compute_selector() by @karlhigley in #146

🔧 Maintenance

Break LocalExecutor.transform() down into smaller methods by @karlhigley in #140
Add and apply DictArray wrapper class and corresponding Protocol definitions by @karlhigley in #141
Specify minimum Python version as 3.8 in setup.py by @oliverholworthy in #151
Add a validate_schemas hook to clean up downstream validation code by @karlhigley in #76
Add XGBoost to merlin-models CPU tests by @karlhigley in #131

Contributors

karlhigley, oliverholworthy, and 2 other contributors

26 Sep 17:58

github-actions

v0.7.0

5926fcf

v0.7.0

What’s Changed

🔧 Maintenance

Switch downstream repo tests from build to check (to make optional) @karlhigley (#137)

Contributors

karlhigley

07 Sep 16:35

github-actions

v0.6.0

b78f7f0

v0.6.0

What’s Changed

⚠ Breaking Changes

fix pull apart list for newer cudf versions @jperez999 (#122)

🐜 Bug Fixes

ensure that combinations of nodes can be used as subgraphs @nv-alaiacano (#130)
Set HAS_GPU = False in dispatch if relevant packages fail to import @oliverholworthy (#112)
remove upstream dependencies that have no outputs @nv-alaiacano (#107)

🚀 Features

add subgraph feature of a Graph @nv-alaiacano (#128)
Split compound tags (like USER_ID) into atomic tags (like USER,ID) @karlhigley (#119)
Add a quantity attribute to ColumnSchema @karlhigley (#118)
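
The entry "Split compound tags (like USER_ID) into atomic tags (like USER,ID)" (#119) can be pictured as a normalization step over a tag list. The sketch below assumes a small hypothetical `Tag` enum and a hand-written `COMPOUND` mapping. Neither is copied from Merlin's `Tags` enum, and the real mapping may differ.

```python
from enum import Enum


class Tag(Enum):
    """Hypothetical tag enum used only for this illustration."""

    USER = "user"
    ITEM = "item"
    ID = "id"
    USER_ID = "user_id"
    ITEM_ID = "item_id"


# Assumed mapping from each compound tag to its atomic parts.
COMPOUND = {
    Tag.USER_ID: (Tag.USER, Tag.ID),
    Tag.ITEM_ID: (Tag.ITEM, Tag.ID),
}


def normalize(tags):
    """Expand compound tags into atomic tags, dropping duplicates in order."""
    result = []
    for tag in tags:
        # Atomic tags map to themselves; compound tags expand to their parts.
        for atomic in COMPOUND.get(tag, (tag,)):
            if atomic not in result:
                result.append(atomic)
    return result


user_tags = normalize([Tag.USER_ID])
mixed_tags = normalize([Tag.USER_ID, Tag.ITEM_ID])
```

Storing only atomic tags means a selection by `Tag.ID` matches both user and item identifier columns without listing every compound variant.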

🔧 Maintenance

Fix versioneer to get accurate version numbers @benfred (#132)
Combine changes that address downstream failures @karlhigley (#136)
Expand models testing in PR checks to include TF and other frameworks @karlhigley (#129)
Migrate Merlin DAG executors from NVTabular @karlhigley (#125)
Improve the organization of the schema tests (column vs schema vs io) @karlhigley (#124)
Split the downstream repo tests into separate Tox environments and Github actions @karlhigley (#127)
Migrate test environment to tox @karlhigley (#126)
fix pull apart list for newer cudf versions @jperez999 (#122)
Use mambabuild for generating conda package in github actions @benfred (#116)
Split compound tags (like USER_ID) into atomic tags (like USER,ID) @karlhigley (#119)
Auto-update pre-commit hook packages @karlhigley (#117)
Update versioneer from 0.21 to 0.23 @oliverholworthy (#114)
Pin fsspec==2022.5.0 @karlhigley (#113)

Contributors

benfred, karlhigley, and 3 other contributors
