Merge branch 'current' into revert-6466-revert-6455-add-microbatch-flag

mirnawong1 authored Nov 25, 2024
2 parents 69c9d18 + 38f5adc commit 9b68f75
Showing 14 changed files with 88 additions and 33 deletions.
4 changes: 2 additions & 2 deletions website/docs/docs/build/incremental-strategy.md
@@ -243,11 +243,11 @@ select * from {{ ref("some_model") }}

:::note limited support

Custom strategies are not currently suppored on the BigQuery and Spark adapters.
Custom strategies are not currently supported on the BigQuery and Spark adapters.

:::

Starting from dbt version 1.2 and onwards, users have an easier alternative to [creating an entirely new materialization](/guides/create-new-materializations). They define and use their own "custom" incremental strategies by:
From dbt v1.2 and onwards, users have an easier alternative to [creating an entirely new materialization](/guides/create-new-materializations). They define and use their own "custom" incremental strategies by:

1. Defining a macro named `get_incremental_STRATEGY_sql`. Note that `STRATEGY` is a placeholder and you should replace it with the name of your custom incremental strategy.
2. Configuring `incremental_strategy: STRATEGY` within an incremental model.
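
For illustration, here's a minimal sketch of a hypothetical `insert_only` custom strategy. The macro name follows the `get_incremental_STRATEGY_sql` convention, and the `arg_dict` keys (`target_relation`, `temp_relation`, `dest_columns`) are the ones dbt passes to incremental strategy macros; the SQL body is an assumption, not a drop-in implementation:

```sql
-- macros/get_incremental_insert_only_sql.sql
{% macro get_incremental_insert_only_sql(arg_dict) %}
  {%- set dest_cols = arg_dict["dest_columns"] | map(attribute="quoted") | join(", ") -%}
  insert into {{ arg_dict["target_relation"] }} ({{ dest_cols }})
  select {{ dest_cols }} from {{ arg_dict["temp_relation"] }}
{% endmacro %}
```

A model then opts in with `{{ config(materialized='incremental', incremental_strategy='insert_only') }}`.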
28 changes: 28 additions & 0 deletions website/docs/docs/build/python-models.md
@@ -598,6 +598,34 @@ Python models have capabilities that SQL models do not. They also have some draw
- **These capabilities are very new.** As data warehouses develop new features, we expect them to offer cheaper, faster, and more intuitive mechanisms for deploying Python transformations. **We reserve the right to change the underlying implementation for executing Python models in future releases.** Our commitment to you is around the code in your model `.py` files, following the documented capabilities and guidance we're providing here.
- **Lack of `print()` support.** The data platform runs and compiles your Python model without dbt's oversight. This means it doesn't display the output of commands such as Python's built-in [`print()`](https://docs.python.org/3/library/functions.html#print) function in dbt's logs.

- <Expandable alt_header="Alternatives to using print() in Python models">

The following are other methods you can use for debugging, such as writing messages to a dataframe column:

- Using platform logs: Use your data platform's logs to debug your Python models.
- Return logs as a dataframe: Create a dataframe containing your logs and build it into the warehouse.
- Develop locally with DuckDB: Test and debug your models locally using DuckDB before deploying them; a profile sketch follows the example below.

Here's an example of debugging in a Python model:

```python
def model(dbt, session):
    dbt.config(
        materialized="table"
    )
    df = dbt.ref("my_source_table").df()

    # One option for debugging: write messages to temporary table column
    # Pros: visibility
    # Cons: won't work if table isn't building for some reason
    msg = "something"
    df["debugging"] = f"My debug message here: {msg}"

    return df
```
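
For the "develop locally with DuckDB" option, a minimal `profiles.yml` sketch using the community `dbt-duckdb` adapter might look like the following (the project name and file path are placeholders):

```yaml
# Hypothetical local target for fast, offline iteration with dbt-duckdb
my_project:
  target: local
  outputs:
    local:
      type: duckdb
      path: dev.duckdb  # local database file; delete it to start fresh
```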
</Expandable>

As a general rule, if there's a transformation you could write equally well in SQL or Python, we believe that well-written SQL is preferable: it's more accessible to a greater number of colleagues, and it's easier to write code that's performant at scale. If there's a transformation you _can't_ write in SQL, or where ten lines of elegant and well-annotated Python could save you 1000 lines of hard-to-read Jinja-SQL, Python is the way to go.

## Specific data platforms {#specific-data-platforms}
2 changes: 1 addition & 1 deletion website/docs/docs/build/snapshots.md
@@ -404,7 +404,7 @@ Snapshot <Term id="table">tables</Term> will be created as a clone of your sourc

Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless):
- These column names can be customized to your team or organizational conventions using the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.
- Use the `dbt_valid_to_current` config to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.
- Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.
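
As a sketch, a YAML-defined snapshot using this config might look like the following (resource names are illustrative):

```yaml
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')
    config:
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
      dbt_valid_to_current: "to_date('9999-12-31')"  # used instead of NULL for current records
```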

| Field | Meaning | Usage |
| -------------- | ------- | ----- |
@@ -25,6 +25,7 @@ To access the features, you should meet the following:
3. You have set up a [production](/docs/deploy/deploy-environments#set-as-production-environment) deployment environment for each project you want to explore, with at least one successful job run.
4. You have [admin permissions](/docs/cloud/manage-access/enterprise-permissions) in dbt Cloud to edit project settings or production environment settings.
5. Use Tableau as your BI tool and enable metadata permissions or work with an admin to do so. Compatible with Tableau Cloud or Tableau Server with the Metadata API enabled.
- If you're using Tableau Server, you need to [allowlist dbt Cloud's IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses) for your dbt Cloud region.

## Set up in Tableau

7 changes: 6 additions & 1 deletion website/docs/docs/collaborate/auto-exposures.md
@@ -13,7 +13,12 @@ As a data team, it’s critical that you have context into the downstream use ca

Auto-exposures helps users understand how their models are used in downstream analytics tools to inform investments and reduce incidents — ultimately building trust and confidence in data products. It imports and auto-generates exposures based on Tableau dashboards, with user-defined curation.

Auto-exposures is available on [Versionless](/docs/dbt-versions/versionless-cloud) and on [dbt Cloud Enterprise](https://www.getdbt.com/pricing/) plans.
## Supported plans
Auto-exposures is available on [Versionless](/docs/dbt-versions/versionless-cloud) and for [dbt Cloud Enterprise](https://www.getdbt.com/pricing/) plans.

:::info Tableau Server
If you're using Tableau Server, you need to [allowlist dbt Cloud's IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses) for your dbt Cloud region.
:::

For more information on how to set up auto-exposures, prerequisites, and more &mdash; refer to [configure auto-exposures in Tableau and dbt Cloud](/docs/cloud-integrations/configure-auto-exposures).

14 changes: 11 additions & 3 deletions website/docs/docs/collaborate/model-query-history.md
@@ -13,12 +13,19 @@ Model query history allows you to:
- Provide data teams insight so they can focus their time and infrastructure spend on the data products that are most used and most worthwhile.
- Enable analysts to find the most popular models used by other people.

Model query history is powered by a single consumption query of the query log table in your data warehouse aggregated on a daily basis. It currently supports Snowflake and BigQuery only, with additional platforms coming soon.
Model query history is powered by a single consumption query of the query log table in your data warehouse aggregated on a daily basis.

<Expandable alt_header="What is a consumption query?">

:::info What is a consumption query?
A consumption query is a metric of the queries in your dbt project that used the model in a given time period. It filters down to `select` statements only to gauge model consumption and excludes dbt model build and test executions.

So for example, if `model_super_santi` was queried 10 times in the past week, it would count as having 10 consumption queries for that particular time period.
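
As a rough illustration only (the real consumption query is managed by dbt Cloud), a daily aggregation over Snowflake's query history could look like this; the table, columns, and filters here are assumptions:

```sql
select
    date_trunc('day', start_time) as query_date,
    count(*) as consumption_queries
from snowflake.account_usage.query_history
where query_type = 'SELECT'
  and query_text ilike '%model_super_santi%'  -- queries that touched the model
group by 1
order by 1;
```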
</Expandable>


:::info Support for Snowflake (Enterprise tier or higher) and BigQuery

Model query history for Snowflake users is **only available on the Enterprise tier or higher**. The feature also supports BigQuery. Additional platforms are coming soon.
:::

## Prerequisites
@@ -28,7 +35,8 @@ To access the features, you should meet the following:
1. You have a dbt Cloud account on the [Enterprise plan](https://www.getdbt.com/pricing/). Single-tenant accounts should contact their account representative for setup.
2. You have set up a [production](https://docs.getdbt.com/docs/deploy/deploy-environments#set-as-production-environment) deployment environment for each project you want to explore, with at least one successful job run.
3. You have [admin permissions](/docs/cloud/manage-access/enterprise-permissions) in dbt Cloud to edit project settings or production environment settings.
4. Use Snowflake (Enterprise tier or higher) or BigQuery as your data warehouse and can enable query history permissions or work with an admin to do so. Support for additional data platforms coming soon.
4. Use Snowflake or BigQuery as your data warehouse, and be able to enable query history permissions or work with an admin to do so. Support for additional data platforms is coming soon.
- For Snowflake users: You **must** have a Snowflake Enterprise tier or higher subscription.

## Enable query history in dbt Cloud

20 changes: 10 additions & 10 deletions website/docs/docs/core/connect-data-platform/risingwave-setup.md
@@ -28,7 +28,7 @@ import SetUpPages from '/snippets/_setup-pages-intro.md';

## Connecting to RisingWave with dbt-risingwave

Before connecting to RisingWave, ensure that RisingWave is installed and running. For more information about how to get RisingWave up and running, see the [RisingWave quick start guide](https://docs.risingwave.com/docs/dev/get-started/).
Before connecting to RisingWave, ensure that RisingWave is installed and running. For more information about how to get RisingWave up and running, see the [RisingWave quick start guide](https://docs.risingwave.com/get-started/quickstart).

To connect to RisingWave with dbt, you need to add a RisingWave profile to your dbt profile file (`~/.dbt/profiles.yml`). Below is an example RisingWave profile. Revise the field values as necessary.
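
A minimal sketch of such a profile, using placeholder credentials and RisingWave's defaults (Postgres-protocol port 4566, `root` user, `dev` database), might look like:

```yaml
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: risingwave
      host: 127.0.0.1
      user: root
      password: ""
      port: 4566
      dbname: dev
      schema: public
```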

@@ -71,17 +71,17 @@ The dbt models for managing data transformations in RisingWave are similar to ty

|Materializations| Supported|Notes|
|----|----|----|
|`table` |Yes |Creates a [table](https://docs.risingwave.com/docs/dev/sql-create-table/). To use this materialization, add `{{ config(materialized='table') }}` to your model SQL files. |
|`view`|Yes | Creates a [view](https://docs.risingwave.com/docs/dev/sql-create-view/). To use this materialization, add `{{ config(materialized='view') }}` to your model SQL files. |
|`ephemeral`|Yes| This materialization uses [common table expressions](https://docs.risingwave.com/docs/dev/query-syntax-with-clause/) in RisingWave under the hood. To use this materialization, add `{{ config(materialized='ephemeral') }}` to your model SQL files.|
|`table` |Yes |Creates a [table](https://docs.risingwave.com/sql/commands/sql-create-table). To use this materialization, add `{{ config(materialized='table') }}` to your model SQL files. |
|`view`|Yes | Creates a [view](https://docs.risingwave.com/sql/commands/sql-create-view). To use this materialization, add `{{ config(materialized='view') }}` to your model SQL files. |
|`ephemeral`|Yes| This materialization uses [common table expressions](https://docs.risingwave.com/sql/query-syntax/with-clause) in RisingWave under the hood. To use this materialization, add `{{ config(materialized='ephemeral') }}` to your model SQL files.|
|`materializedview`| To be deprecated. |It is available only for backward compatibility purposes (for v1.5.1 of the dbt-risingwave adapter plugin). If you are using v1.6.0 and later versions of the dbt-risingwave adapter plugin, use `materialized_view` instead.|
|`materialized_view`| Yes| Creates a [materialized view](https://docs.risingwave.com/docs/dev/sql-create-mv/). This materialization corresponds the `incremental` one in dbt. To use this materialization, add `{{ config(materialized='materialized_view') }}` to your model SQL files.|
|`materialized_view`| Yes| Creates a [materialized view](https://docs.risingwave.com/sql/commands/sql-create-mv). This materialization corresponds to the `incremental` materialization in dbt. To use this materialization, add `{{ config(materialized='materialized_view') }}` to your model SQL files.|
| `incremental`|No|Use `materialized_view` instead. RisingWave is designed to use materialized views to manage data transformations incrementally, so the `materialized_view` materialization covers this use case.|
|`source`| Yes| Creates a [source](https://docs.risingwave.com/docs/dev/sql-create-source/). To use this materialization, add \{\{ config(materialized='source') \}\} to your model SQL files. You need to provide your create source statement as a whole in this model. See [Example model files](https://docs.risingwave.com/docs/dev/use-dbt/#example-model-files) for details.|
|`table_with_connector`| Yes| Creates a table with connector settings. In RisingWave, a table with connector settings is similar to a source. The difference is that a table object with connector settings persists raw streaming data in the source, while a source object does not. To use this materialization, add `{{ config(materialized='table_with_connector') }}` to your model SQL files. You need to provide your create table with connector statement as a whole in this model (see [Example model files](https://docs.risingwave.com/docs/dev/use-dbt/#example-model-files) for details). Because dbt tables have their own semantics, RisingWave use `table_with_connector` to distinguish itself from a dbt table.|
|`sink`| Yes| Creates a [sink](https://docs.risingwave.com/docs/dev/sql-create-sink/). To use this materialization, add `{{ config(materialized='sink') }}` to your SQL files. You need to provide your create sink statement as a whole in this model. See [Example model files](https://docs.risingwave.com/docs/dev/use-dbt/#example-model-files) for details.|
|`source`| Yes| Creates a [source](https://docs.risingwave.com/sql/commands/sql-create-source). To use this materialization, add \{\{ config(materialized='source') \}\} to your model SQL files. You need to provide your create source statement as a whole in this model. See [Example model files](https://docs.risingwave.com/integrations/other/dbt#example-model-files) for details.|
|`table_with_connector`| Yes| Creates a table with connector settings. In RisingWave, a table with connector settings is similar to a source. The difference is that a table object with connector settings persists raw streaming data in the source, while a source object does not. To use this materialization, add `{{ config(materialized='table_with_connector') }}` to your model SQL files. You need to provide your create table with connector statement as a whole in this model (see [Example model files](https://docs.risingwave.com/integrations/other/dbt#example-model-files) for details). Because dbt tables have their own semantics, RisingWave uses `table_with_connector` to distinguish it from a dbt table.|
|`sink`| Yes| Creates a [sink](https://docs.risingwave.com/sql/commands/sql-create-sink). To use this materialization, add `{{ config(materialized='sink') }}` to your SQL files. You need to provide your create sink statement as a whole in this model. See [Example model files](https://docs.risingwave.com/integrations/other/dbt#example-model-files) for details.|
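
As a usage sketch, an incremental-style transformation declared with the `materialized_view` materialization might look like this (model and column names are illustrative):

```sql
-- models/order_counts.sql
{{ config(materialized='materialized_view') }}

select
    customer_id,
    count(*) as order_count
from {{ ref('orders') }}
group by customer_id
```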

## Resources

- [RisingWave's guide about using dbt for data transformations](https://docs.risingwave.com/docs/dev/use-dbt/)
- [A demo project using dbt to manage Nexmark benchmark queries in RisingWave](https://docs.risingwave.com/docs/dev/use-dbt/)
- [RisingWave's guide about using dbt for data transformations](https://docs.risingwave.com/integrations/other/dbt)
- [A demo project using dbt to manage Nexmark benchmark queries in RisingWave](https://github.com/risingwavelabs/dbt_rw_nexmark)
@@ -63,6 +63,7 @@ Beginning in dbt Core 1.9, we've streamlined snapshot configuration and added a
- `target_schema` is now optional for snapshots: When omitted, snapshots will use the schema defined for the current environment.
- Standard `schema` and `database` configs supported: Snapshots will now be consistent with other dbt resource types. You can specify where environment-aware snapshots should be stored.
- Warning for incorrect `updated_at` data type: To ensure data integrity, you'll see a warning if the `updated_at` field specified in the snapshot configuration is not the proper data type or timestamp.
- Set a custom current indicator for the value of `dbt_valid_to`: Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.
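
Taken together, a 1.9-style snapshot sketch using these options (names are illustrative) might look like:

```yaml
snapshots:
  - name: products_snapshot
    relation: source('jaffle_shop', 'products')
    config:
      schema: snapshots            # standard schema config; target_schema no longer required
      unique_key: id
      strategy: timestamp
      updated_at: updated_at       # should be a timestamp column to avoid the new warning
      dbt_valid_to_current: "to_date('9999-12-31')"
```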

Read more about [Snapshots meta fields](/docs/build/snapshots#snapshot-meta-fields).

@@ -107,6 +108,6 @@ You can read more about each of these behavior changes in the following links:
We also made some quality-of-life improvements in Core 1.9, enabling you to:

- Maintain data quality now that dbt returns an error (versioned models) or warning (unversioned models) when someone [removes a contracted model by deleting, renaming, or disabling](/docs/collaborate/govern/model-contracts#how-are-breaking-changes-handled) it.
- Document [singular data tests](/docs/build/data-tests#singular-data-tests).
- Document [data tests](/reference/resource-properties/description).
- Use `ref` and `source` in [foreign key constraints](/reference/resource-properties/constraints).
- Use `dbt test` with the `--resource-type` / `--exclude-resource-type` flag, making it possible to include or exclude data tests (`test`) or unit tests (`unit_test`).
1 change: 0 additions & 1 deletion website/docs/faqs/Docs/long-descriptions.md
@@ -32,4 +32,3 @@ If you need more than a sentence to explain a model, you can:
```

3. Use a [docs block](/docs/build/documentation#using-docs-blocks) to write the description in a separate Markdown file.
b
@@ -4,19 +4,28 @@ description: "Edit your OAuth Security integration when you see error"
sidebar_label: 'Receiving `Failed to connect to database` error'
---

1. If you see this error:
1. If you see the following error:

```shell

```text
Failed to connect to DB: xxxxxxx.snowflakecomputing.com:443. The role requested in the connection, or the default role if none was requested in the connection ('xxxxx'), is not listed in the Access Token or was filtered.
Please specify another role, or contact your OAuth Authorization server administrator.

```

2. Edit your OAuth Security integration and explicitly specify this scope mapping attribute:

```sql
ALTER INTEGRATION <my_int_name> SET EXTERNAL_OAUTH_SCOPE_MAPPING_ATTRIBUTE = 'scp';
```
```sql
ALTER INTEGRATION <my_int_name> SET EXTERNAL_OAUTH_SCOPE_MAPPING_ATTRIBUTE = 'scp';
```

You can read more about this error in [Snowflake's documentation](https://community.snowflake.com/s/article/external-custom-oauth-error-the-role-requested-in-the-connection-is-not-listed-in-the-access-token).

----

1. If you see the following error:

```text
Failed to connect to DB: xxxxxxx.snowflakecomputing.com:443. Incorrect username or password was specified.
```

2. Check the following to resolve the error:

* **Unique email addresses** &mdash; Each user in Snowflake must have a unique email address. You can't have multiple users (for example, a human user and a service account) using the same email, such as `[email protected]`, to authenticate to Snowflake.
* **Match email addresses with identity provider** &mdash; The email address of your Snowflake user must exactly match the email address you use to authenticate with your Identity Provider (IdP). For example, if your Snowflake user's email is `[email protected]` but you log in to Entra or Okta with `[email protected]`, this mismatch can cause an error.
2 changes: 1 addition & 1 deletion website/docs/guides/zapier-ms-teams.md
@@ -136,7 +136,7 @@ for step in run_data_results['run_steps']:
# Remove timestamp and any colour tags
full_log = re.sub('\x1b?\[[0-9]+m[0-9:]*', '', full_log)

summary_start = re.search('(?:Completed with \d+ errors? and \d+ warnings?:|Database Error|Compilation Error|Runtime Error)', full_log)
summary_start = re.search('(?:Completed with \d+ error.* and \d+ warnings?:|Database Error|Compilation Error|Runtime Error)', full_log)

line_items = re.findall('(^.*(?:Failure|Error) in .*\n.*\n.*)', full_log, re.MULTILINE)
