Commit

Merge branch 'nfiann-rbip' into new-branch-name
runleonarun authored Dec 6, 2024
2 parents e27c337 + 7f824a4 commit 9de3005
Showing 6 changed files with 19 additions and 25 deletions.
7 changes: 2 additions & 5 deletions website/blog/2021-11-23-how-to-upgrade-dbt-versions.md
@@ -12,17 +12,14 @@ date: 2021-11-29
is_featured: true
---

import Latest from '/snippets/_release-stages-from-versionless.md'

<Latest/>

:::tip February 2024 Update

It's been a few years since dbt-core turned 1.0! Since then, we've committed to releasing zero breaking changes whenever possible and it's become much easier to upgrade dbt Core versions.

In 2024, we're taking this promise further by:

- Stabilizing interfaces for everyone — adapter maintainers, metadata consumers, and (of course) people writing dbt code everywhere — as discussed in [our November 2023 roadmap update](https://github.com/dbt-labs/dbt-core/blob/main/docs/roadmap/2023-11-dbt-tng.md).
- Introducing **Latest** release track in dbt Cloud. No more manual upgrades and no need for _a second sandbox project_ just to try out new features in development. For more details, refer to [Upgrade Core version in Cloud](/docs/dbt-versions/upgrade-dbt-version-in-cloud).
- Introducing [Release tracks](/docs/dbt-versions/cloud-release-tracks) (formerly known as Versionless) to dbt Cloud. No more manual upgrades and no need for _a second sandbox project_ just to try out new features in development. For more details, refer to [Upgrade Core version in Cloud](/docs/dbt-versions/upgrade-dbt-version-in-cloud).

We're leaving the rest of this post as is, so we can all remember how it used to be. Enjoy a stroll down memory lane.

6 changes: 1 addition & 5 deletions website/blog/2024-04-22-extended-attributes.md
@@ -12,10 +12,6 @@ date: 2024-04-22
is_featured: true
---

import Latest from '/snippets/_release-stages-from-versionless.md'

<Latest/>

dbt Cloud now includes a suite of new features that enable configuring precise and unique connections to data platforms at the environment and user level. These enable more sophisticated setups, like connecting a project to multiple warehouse accounts, first-class support for [staging environments](/docs/deploy/deploy-environments#staging-environment), and user-level [overrides for specific dbt versions](/docs/dbt-versions/upgrade-dbt-version-in-cloud#override-dbt-version). This gives dbt Cloud developers the features they need to tackle more complex tasks, like Write-Audit-Publish (WAP) workflows and safely testing dbt version upgrades. While you still configure a default connection at the project level and per-developer, you now have tools to get more advanced in a secure way. Soon, dbt Cloud will take this even further allowing multiple connections to be set globally and reused with _global connections_.

<!--truncate-->
@@ -84,7 +80,7 @@ All you need to do is configure an environment as staging and enable the **Defer

## Upgrading on a curve

Lastly, let’s consider a more specialized use case. Imagine we have a "tiger team" (consisting of a lone analytics engineer named Dave) tasked with upgrading from dbt version 1.6 to the new **Latest release track**, to take advantage of new features and performance improvements. We want to keep the rest of the data team being productive in dbt 1.6 for the time being, while enabling Dave to upgrade and do his work with Latest (and greatest) dbt.
Lastly, let’s consider a more specialized use case. Imagine we have a "tiger team" (consisting of a lone analytics engineer named Dave) tasked with upgrading from dbt version 1.6 to the new **[Latest release track](/docs/dbt-versions/cloud-release-tracks)**, to take advantage of new features and performance improvements. We want to keep the rest of the data team being productive in dbt 1.6 for the time being, while enabling Dave to upgrade and do his work with Latest (and greatest) dbt.

### Development environment

18 changes: 7 additions & 11 deletions website/blog/2024-06-12-putting-your-dag-on-the-internet.md
@@ -12,11 +12,7 @@ date: 2024-06-14
is_featured: true
---

import Latest from '/snippets/_release-stages-from-versionless.md'

<Latest/>

**New in dbt: allow Snowflake Python models to access the internet**
## New in dbt: allow Snowflake Python models to access the internet

With dbt 1.8, dbt released support for Snowflake’s [external access integrations](https://docs.snowflake.com/en/developer-guide/external-network-access/external-network-access-overview) further enabling the use of dbt + AI to enrich your data. This allows querying of external APIs within dbt Python models, a functionality that was required for dbt Cloud customer, [EQT AB](https://eqtgroup.com/). Learn about why they needed it and how they helped build the feature and get it shipped!

@@ -49,7 +45,7 @@ This API is open and if it requires an API key, handle it similarly to managing
For simplicity’s sake, we will show how to create them using [pre-hooks](/reference/resource-configs/pre-hook-post-hook) in a model configuration yml file:


```
```yml
models:
  - name: external_access_sample
    config:
@@ -61,7 +57,7 @@ models:
Then we can simply use the new external_access_integrations configuration parameter to use our network rule within a Python model (called external_access_sample.py):
```
```python
import snowflake.snowpark as snowpark
def model(dbt, session: snowpark.Session):
    dbt.config(
@@ -79,7 +75,7 @@ def model(dbt, session: snowpark.Session):
The result is a model with some json I can parse, for example, in a SQL model to extract some information:


```
```sql
{{
    config(
        materialized='incremental',
@@ -112,12 +108,12 @@ The result is a model that will keep track of dbt invocations, and the current U

This is a very new area to Snowflake and dbt -- something special about SQL and dbt is that it’s very resistant to external entropy. The second we rely on API calls, Python packages and other external dependencies, we open up to a lot more external entropy. APIs will change, break, and your models could fail.

Traditionally dbt is the T in ELT (dbt overview [here](https://docs.getdbt.com/terms/elt)), and this functionality unlocks brand new EL capabilities for which best practices do not yet exist. What’s clear is that EL workloads should be separated from T workloads, perhaps in a different modeling layer. Note also that unless using incremental models, your historical data can easily be deleted. dbt has seen a lot of use cases for this, including this AI example as outlined in this external [engineering blog post](https://klimmy.hashnode.dev/enhancing-your-dbt-project-with-large-language-models).
Traditionally dbt is the T in ELT (dbt overview [here](https://docs.getdbt.com/terms/elt)), and this functionality unlocks brand new EL capabilities for which best practices do not yet exist. What’s clear is that EL workloads should be separated from T workloads, perhaps in a different modeling layer. Note also that unless using incremental models, your historical data can easily be deleted. dbt has seen a lot of use cases for this, including this AI example as outlined in this external [engineering blog post](https://klimmy.hashnode.dev/enhancing-your-dbt-project-with-large-language-models).

**A few words about the power of Commercial Open Source Software**
## A few words about the power of Commercial Open Source Software

In order to get this functionality shipped quickly, EQT opened a pull request, Snowflake helped with some problems we had with CI and a member of dbt Labs helped write the tests and merge the code in!

dbt now features this functionality in dbt 1.8+ and the "Latest" release track in dbt Cloud (dbt overview [here](/docs/dbt-versions/cloud-release-tracks)).
dbt now features this functionality in dbt 1.8+ and all [Release tracks](/docs/dbt-versions/cloud-release-tracks) in dbt Cloud.

dbt Labs staff and community members would love to chat more about it in the [#db-snowflake](https://getdbt.slack.com/archives/CJN7XRF1B) slack channel.
9 changes: 6 additions & 3 deletions website/docs/docs/build/incremental-microbatch.md
@@ -25,7 +25,8 @@ Incremental models in dbt are a [materialization](/docs/build/materializations)
Microbatch is an incremental strategy designed for large time-series datasets:
- It relies solely on a time column ([`event_time`](/reference/resource-configs/event-time)) to define time-based ranges for filtering. Set the `event_time` column for your microbatch model and its direct parents (upstream models). Note, this is different to `partition_by`, which groups rows into partitions.
- It complements, rather than replaces, existing incremental strategies by focusing on efficiency and simplicity in batch processing.
- Unlike traditional incremental strategies, microbatch doesn't require implementing complex conditional logic for [backfilling](#backfills).
- Unlike traditional incremental strategies, microbatch enables you to [reprocess failed batches](/docs/build/incremental-microbatch#retry), auto-detect [parallel batch execution](#parallel-batch-execution), and eliminate the need to implement complex conditional logic for [backfilling](#backfills).

- Note, microbatch might not be the best strategy for all use cases. Consider other strategies for use cases such as not having a reliable `event_time` column or if you want more control over the incremental logic. Read more in [How `microbatch` compares to other incremental strategies](#how-microbatch-compares-to-other-incremental-strategies).
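
To make the time-based filtering described above concrete, here is a minimal Python sketch of the idea. It is not dbt's implementation: the helper names `daily_batches` and `rows_in_batch` and the sample rows are hypothetical, and it assumes a daily batch size with half-open `[start, end)` ranges on naive timestamps.

```python
from datetime import datetime, timedelta


def daily_batches(start, end):
    """Yield half-open (batch_start, batch_end) day windows covering [start, end)."""
    # Truncate the start to midnight so every window aligns to a calendar day.
    cursor = datetime(start.year, start.month, start.day)
    while cursor < end:
        yield cursor, cursor + timedelta(days=1)
        cursor += timedelta(days=1)


def rows_in_batch(rows, batch_start, batch_end, event_time="event_time"):
    """Keep only rows whose event_time falls inside the batch window."""
    return [r for r in rows if batch_start <= r[event_time] < batch_end]


# Hypothetical sample data: three events spread across two days.
rows = [
    {"id": 1, "event_time": datetime(2024, 12, 1, 9, 30)},
    {"id": 2, "event_time": datetime(2024, 12, 1, 23, 59)},
    {"id": 3, "event_time": datetime(2024, 12, 2, 0, 0)},
]

batches = list(daily_batches(datetime(2024, 12, 1), datetime(2024, 12, 3)))
per_batch = [[r["id"] for r in rows_in_batch(rows, s, e)] for s, e in batches]
```

Because each window is half-open, an event at exactly midnight belongs to the later batch only, so no row is counted twice.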

### How microbatch works
@@ -300,9 +301,11 @@ To enable parallel execution, you must meet the following conditions:
- You use the following supported adapters:
- Snowflake
- Databricks
- More adapters coming soon!
- More adapters coming soon!
- We'll be continuing to test and add concurrency support for adapters. This means that some adapters might get concurrency support _after_ the 1.9 initial release.



We'll be continuing to test and add concurrency support for adapters. This means that some adapters might get concurrency support _after_ the 1.9 initial release.
- You meet [additional conditions](#how-parallel-batch-execution-works) mentioned in the next section

### How parallel batch execution works
@@ -4,7 +4,7 @@ description: "Read this guide to learn about the Microsoft SQL Server warehouse
id: "mssql-setup"
meta:
maintained_by: Community
authors: 'dbt-msft community (https://github.com/dbt-msft)'
authors: 'Mikael Ene & dbt-msft community (https://github.com/dbt-msft)'
github_repo: 'dbt-msft/dbt-sqlserver'
pypi_package: 'dbt-sqlserver'
min_core_version: 'v0.14.0'
@@ -49,6 +49,8 @@ Starting in Core 1.9, you can use the new [microbatch strategy](/docs/build/incr
- Simplified query design: Write your model query for a single batch of data. dbt will use your `event_time`, `lookback`, and `batch_size` configurations to automatically generate the necessary filters for you, making the process more streamlined and reducing the need for you to manage these details.
- Independent batch processing: dbt automatically breaks down the data to load into smaller batches based on the specified `batch_size` and processes each batch independently, improving efficiency and reducing the risk of query timeouts. If some of your batches fail, you can use `dbt retry` to load only the failed batches.
- Targeted reprocessing: To load a *specific* batch or batches, you can use the CLI arguments `--event-time-start` and `--event-time-end`.
- [Automatic parallel batch execution](/docs/build/incremental-microbatch#parallel-batch-execution): Process multiple batches at the same time, instead of one after the other (sequentially) for faster processing of your microbatch models. dbt intelligently auto-detects if your batches can run in parallel, while also allowing you to manually override parallel execution with the `concurrent_batches` config.
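
The independent-batch and retry behavior in the list above can be illustrated with a small Python sketch. This is an illustration under stated assumptions rather than how dbt works internally: `run_batches`, `process`, the `parallel` flag, and the simulated failure are all hypothetical names, and a thread pool stands in for whatever concurrency the adapter provides.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batches(batch_ids, process, parallel=False):
    """Run each batch independently and return the ids of the batches that failed."""

    def attempt(batch_id):
        # One failing batch must not stop the others, so catch errors per batch.
        try:
            process(batch_id)
            return None
        except Exception:
            return batch_id

    if parallel:
        with ThreadPoolExecutor(max_workers=4) as pool:
            results = list(pool.map(attempt, batch_ids))
    else:
        results = [attempt(b) for b in batch_ids]
    return [b for b in results if b is not None]


# Hypothetical processor: the 2024-12-02 batch fails once, then succeeds.
flaky = {"2024-12-02"}
loaded = []


def process(batch_id):
    if batch_id in flaky:
        flaky.discard(batch_id)
        raise RuntimeError("simulated query timeout")
    loaded.append(batch_id)


# First run: batches execute concurrently and one of them fails.
failed = run_batches(["2024-12-01", "2024-12-02", "2024-12-03"], process, parallel=True)
# Retry: only the failed batch is processed again, sequentially this time.
retried = run_batches(failed, process)
```

The second call mirrors the idea behind `dbt retry`: batches that already loaded are left alone, and only the failed ones are reprocessed.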


Currently microbatch is supported on these adapters with more to come:
* postgres
