5.1.0
This release brings SSH tunnel connection recovery to Redshift Loader. It also adds an option to disable in-batch natural deduplication in Batch Transformer.
Option to disable in-batch natural deduplication in Batch Transformer
Previously, it wasn't possible to disable in-batch natural deduplication in Batch Transformer. We have found that in-batch natural deduplication affects performance, so we have added an option to turn it off. If duplicate events aren't a problem for you, we suggest disabling deduplication.
It can be disabled by adding the following section to the config:
"deduplication": {
  # When natural deduplication is disabled, 'synthetic' deduplication needs to be disabled too.
  "synthetic": {
    "type": "NONE"
  }
  "natural": false
}
More information about deduplication can be found in the Batch Transformer documentation.
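To make concrete what the "natural" flag turns off, here is a minimal Python sketch. It assumes the usual Snowplow definition, which these notes do not spell out: natural duplicates are events in the same batch that share both `event_id` and `event_fingerprint`. The function name `drop_natural_duplicates` and the dict-shaped events are hypothetical and chosen for illustration only. This is not the Transformer's actual implementation.

```python
from typing import Iterable


def drop_natural_duplicates(events: Iterable[dict], enabled: bool = True) -> list[dict]:
    """Keep the first event seen for each (event_id, event_fingerprint) pair.

    Hypothetical helper: with enabled=False the batch passes through
    untouched, mirroring "natural": false in the config.
    """
    if not enabled:
        return list(events)

    seen: set[tuple[str, str]] = set()
    unique: list[dict] = []
    for event in events:
        key = (event["event_id"], event["event_fingerprint"])
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


batch = [
    {"event_id": "a", "event_fingerprint": "f1"},
    {"event_id": "a", "event_fingerprint": "f1"},  # natural duplicate of the first event
    {"event_id": "b", "event_fingerprint": "f2"},
]

deduplicated = drop_natural_duplicates(batch)               # two events remain
passed_through = drop_natural_duplicates(batch, enabled=False)  # all three remain
```

In this sketch the enabled path has to remember every key seen in the batch, while the disabled path does no per-event bookkeeping at all.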
SSH tunnel connection recovery in Redshift Loader
Redshift Loader can connect to a private Redshift cluster through an SSH tunnel. Previously, if the SSH tunnel session was disconnected, the loader had no way to detect it. We added retries around the SSH tunnel connection so that the loader can recover from a dropped session, which makes it more robust.
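The general idea of retrying around a tunnel connection can be sketched in Python as follows. This only illustrates the pattern and is not the loader's actual code: `with_tunnel_retry`, `action`, `reconnect`, the attempt count, and the exponential backoff are all assumptions made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_tunnel_retry(
    action: Callable[[], T],
    reconnect: Callable[[], None],
    max_attempts: int = 3,
    backoff_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`; on a connection failure, re-open the tunnel and try again.

    Hypothetical helper: `action` stands in for any work that goes through
    the tunnel, and `reconnect` for whatever re-establishes the SSH session.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the failure to the caller
            # Wait a little longer after each failure before reconnecting.
            sleep(backoff_seconds * 2 ** (attempt - 1))
            reconnect()
    raise AssertionError("unreachable")
```

A caller would wrap each operation that uses the tunnel, for example `with_tunnel_retry(run_load, open_tunnel)`, where both callables are placeholders supplied by the caller.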
Upgrading to 5.1.0
If you are already using a recent version of RDB Loader (3.0.0 or higher), upgrading to 5.1.0 is as simple as pulling the newest Docker images. No changes to your configuration files are needed.
docker pull snowplow/transformer-kinesis:5.1.0
docker pull snowplow/rdb-loader-redshift:5.1.0
docker pull snowplow/rdb-loader-snowflake:5.1.0
docker pull snowplow/rdb-loader-databricks:5.1.0
The Snowplow docs site has a full guide to running the RDB Loader.