Adding multiple unique keys (#6438)
Closes #6343

## What are you changing in this pull request and why?

Adds multiple unique keys as outlined in [this
plan](#6343 (comment)).
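
As a rough illustration of the change, here is a sketch reconstructed from the `unique_key.md` hunk in this commit. The indentation is an assumption, since the rendered diff does not preserve it. A snapshot configured with multiple unique keys might look like this:

```yaml
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      # unique_key given as a list of columns (dbt v1.9 and later, per this change)
      unique_key:
        - order_id
        - product_id
      strategy: timestamp
      updated_at: updated_at
```

For dbt v1.8 and earlier, the docs in this commit keep the concatenated-expression approach in a separate version block.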

## Checklist
- [ ] I have reviewed the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [ ] The topic I'm writing about is for specific dbt version(s) and I
have versioned it according to the [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and/or [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content)
guidelines.
- [ ] I have added checklist item(s) to this list for anything
that needs to happen before this PR is merged, such as "needs technical
review" or "change base branch."
- [ ] The content in this PR requires a dbt release note, so I added one
to the [release notes
page](https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes).
<!--
PRE-RELEASE VERSION OF dbt (if so, uncomment):
- [ ] Add a note to the prerelease version [Migration
Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
-->
<!-- 
ADDING OR REMOVING PAGES (if so, uncomment):
- [ ] Add/remove page in `website/sidebars.js`
- [ ] Provide a unique filename for new pages
- [ ] Add an entry for deleted pages in `website/vercel.json`
- [ ] Run link testing locally with `npm run build` to update the links
that point to deleted pages
-->

<!-- vercel-deployment-preview -->
---
🚀 Deployment available! Here are the direct links to the updated files:


- https://docs-getdbt-com-git-mult-unique-keys-dbt-labs.vercel.app/docs/build/snapshots
- https://docs-getdbt-com-git-mult-unique-keys-dbt-labs.vercel.app/reference/resource-configs/unique_key

<!-- end-vercel-deployment-preview -->

---------

Co-authored-by: Mirna Wong <[email protected]>
runleonarun and mirnawong1 authored Nov 14, 2024
1 parent bfbbc1f commit 988f1b7
Showing 2 changed files with 19 additions and 30 deletions.
2 changes: 1 addition & 1 deletion website/docs/docs/build/snapshots.md
@@ -83,7 +83,7 @@ The following table outlines the configurations available for snapshots:
| [schema](/reference/resource-configs/schema) | Specify a custom schema for the snapshot | No | snapshots |
| [alias](/reference/resource-configs/alias) | Specify an alias for the snapshot | No | your_custom_snapshot |
| [strategy](/reference/resource-configs/strategy) | The snapshot strategy to use. Valid values: `timestamp` or `check` | Yes | timestamp |
| [unique_key](/reference/resource-configs/unique_key) | A <Term id="primary-key" /> column or expression for the record | Yes | id |
| [unique_key](/reference/resource-configs/unique_key) | A <Term id="primary-key" /> column(s) (string or array) or expression for the record | Yes | `id` or `[order_id, product_id]` |
| [check_cols](/reference/resource-configs/check_cols) | If using the `check` strategy, then the columns to check | Only if using the `check` strategy | ["status"] |
| [updated_at](/reference/resource-configs/updated_at) | If using the `timestamp` strategy, the timestamp column to compare | Only if using the `timestamp` strategy | updated_at |
| [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) | Find hard deleted records in source and set `dbt_valid_to` to current time if the record no longer exists | No | True |
47 changes: 18 additions & 29 deletions website/docs/reference/resource-configs/unique_key.md
@@ -1,6 +1,6 @@
---
resource_types: [snapshots]
description: "Unique_key - Read this in-depth guide to learn about configurations in dbt."
description: "Learn more about unique_key configurations in dbt."
datatype: column_name_or_expression
---

@@ -14,7 +14,7 @@ snapshots:
- name: orders_snapshot
relation: source('my_source', 'my_table')
[config](/reference/snapshot-configs):
unique_key: id
unique_key: order_id

```

@@ -52,7 +52,7 @@ snapshots:
## Description
A column name or expression that is unique for the inputs of a snapshot. dbt uses this to match records between a result set and an existing snapshot, so that changes can be captured correctly.

In Versionless and dbt v1.9 and later, [snapshots](/docs/build/snapshots) are defined and configured in YAML files within your `snapshots/` directory. The `unique_key` is specified within the `config` block of your snapshot YAML file.
In Versionless and dbt v1.9 and later, [snapshots](/docs/build/snapshots) are defined and configured in YAML files within your `snapshots/` directory. You can specify one or multiple `unique_key` values within your snapshot YAML file's `config` key.

:::caution

@@ -114,29 +114,37 @@ snapshots:

</File>

### Use a combination of two columns as a unique key
This configuration accepts a valid column expression. As such, you can concatenate two columns together as a unique key if required. It's a good idea to use a separator (e.g. `'-'`) to ensure uniqueness.

<VersionBlock firstVersion="1.9">

### Use multiple unique keys

You can configure snapshots to use multiple unique keys for `primary_key` columns.

<File name='snapshots/transaction_items_snapshot.yml'>

```yaml
snapshots:
- name: transaction_items_snapshot
relation: source('erp', 'transactions')
- name: orders_snapshot
relation: source('jaffle_shop', 'orders')
config:
schema: snapshots
unique_key: "transaction_id || '-' || line_item_id"
unique_key:
- order_id
- product_id
strategy: timestamp
updated_at: updated_at

```
</File>
</VersionBlock>
<VersionBlock lastVersion="1.8">
### Use a combination of two columns as a unique key
This configuration accepts a valid column expression. As such, you can concatenate two columns together as a unique key if required. It's a good idea to use a separator (for example, `'-'`) to ensure uniqueness.
<File name='snapshots/transaction_items_snapshot.sql'>
```jinja2
@@ -159,25 +167,9 @@ from {{ source('erp', 'transactions') }}
```

</File>
</VersionBlock>

Though, it's probably a better idea to construct this column in your query and use that as the `unique_key`:

<VersionBlock firstVersion="1.9">

<File name='snapshots/transaction_items_snapshot.yml'>

```yaml
snapshots:
- name: transaction_items_snapshot
relation: {{ ref('transaction_items_ephemeral') }}
config:
schema: snapshots
unique_key: id
strategy: timestamp
updated_at: updated_at
```
</File>

<File name='models/transaction_items_ephemeral.sql'>

@@ -195,9 +187,6 @@ from {{ source('erp', 'transactions') }}

In this example, we create an ephemeral model `transaction_items_ephemeral` that creates an `id` column that can be used as the `unique_key` our snapshot configuration.

</VersionBlock>

<VersionBlock lastVersion="1.8">
<File name='snapshots/transaction_items_snapshot.sql'>

```jinja2
