theme: mdsFest
marp: true

From Overgrown to Thriving

Scaling Your dbt Project Like a Gardener

MDS FEST
From Overgrown to Thriving
1

<style scoped> small { color: gray; font-weight: thin; margin: 0; } section:where(.left) { display: flex; flex-flow: column nowrap; justify-content: center; } </style>

Nicholas Yager

they/them

Principal Analytics Engineer
HubSpot


dbt projects are tricky to scale

  1. Large organizations trend towards decentralization as they grow
  2. Decentralization can lead to inconsistent standards and significant overhead
  3. It's so easy to add "just one more" model

This leads to sprawl


The five steps

  1. Survey your garden
  2. Clear out the trash and weeds
  3. Renewal pruning
  4. Divide the perennials
  5. Keep the weeds under control

Step One: Survey your garden


Survey your garden

  1. What are your core entities?
  2. What are your exposures?
  3. How are your data consumers using your models?
  4. Are there any obvious architectural issues?
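
One way to record the answer to question 2 is to declare exposures in the project itself. This is a minimal sketch, assuming a hypothetical dashboard that reads from a model named deals; the file path, exposure name, owner, and email are placeholders.

```yaml
# models/exposures.yml (hypothetical path)
exposures:
  - name: weekly_revenue_dashboard      # hypothetical exposure name
    label: Weekly Revenue Dashboard
    type: dashboard                     # dashboard, notebook, analysis, ml, or application
    maturity: high
    owner:
      name: Revenue Analytics           # placeholder owner
      email: revenue-analytics@example.com
    depends_on:
      - ref('deals')                    # models this exposure reads from
```

With an exposure declared, `dbt ls --select +exposure:weekly_revenue_dashboard` lists everything upstream of the dashboard.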

Step Two: Clear out the trash and weeds


Step Three: Renewal pruning


Step Four: Divide the perennials


Groups, access, and versions


groups:
  - name: revenue
    owner:
      email: [email protected]

  - name: customer_success
    owner:
      email: [email protected]
models:
  - name: deals
    group: revenue
    access: public

  - name: stg_crm__customers
    group: revenue
    access: private


models:
  - name: deals
    group: revenue
    access: public

    columns:
      - name: deal_id
        data_type: int

      - name: favorite_color
        data_type: varchar

    latest_version: 1
    versions:
      - v: 1 # Version described above
        deprecation_date: 2023-08-30 # Deprecation warnings will be returned when referenced

      - v: 2 # The new version in pre-release. Removes favorite_color
        columns:
          - include: all
            exclude: [favorite_color]

models:
  - name: deals
    group: revenue
    access: public

    columns:
      - name: deal_id
        data_type: int

      - name: favorite_color
        data_type: varchar

    latest_version: 2 # Upgrade to the new version!!1!
    versions:
      - v: 1
        deprecation_date: 2023-08-30

      - v: 2
        columns:
          - include: all
            exclude: [favorite_color]

Multi-project deployments

⚠️ Caution: Prickly practice 🌵
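
A minimal sketch of how a downstream project might declare an upstream one, assuming a platform that supports cross-project references (for example, dbt Cloud project dependencies). The project name revenue_platform is hypothetical.

```yaml
# dependencies.yml in the downstream project
projects:
  - name: revenue_platform   # hypothetical upstream project name
```

Downstream models would then reference public upstream models with a two-argument ref, such as `ref('revenue_platform', 'deals')`.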


Step Five: Keep the weeds under control


Your process is more important than your tools


Ways to keep the weeds under control

  1. Perform code reviews for every change, and make reviews easy!
    • Show how your DAG changes in each PR
    • Pick your SQL coding conventions and enforce them with a SQL formatter
    • Use CI/CD and dbt tests
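
As a sketch of the last bullet, dbt's generic tests can guard a model's key on every PR. The example assumes the deals model and deal_id column shown earlier in this deck. The key is written `tests:` here; dbt 1.8 and later also accept `data_tests:`.

```yaml
models:
  - name: deals
    columns:
      - name: deal_id
        tests:            # `data_tests:` in dbt 1.8 and later
          - unique        # no duplicate deal_id values
          - not_null      # every row has a deal_id
```

A CI job could then run `dbt build --select state:modified+ --state <path to production artifacts>` so that only changed models and their descendants are built and tested.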

Ways to keep the weeds under control

  1. Perform code reviews for every change, and make reviews easy!
  2. Review your project's architecture often
    • Use an architecture evaluation tool like dbt Project Evaluator or Whetstone
    • Check for undervalued core entities
    • Be on the lookout for commonly joined models
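
A minimal sketch of adding dbt Project Evaluator to a project. The version range below is a placeholder assumption, not a recommendation; check the package hub for the release that matches your dbt version.

```yaml
# packages.yml
packages:
  - package: dbt-labs/dbt_project_evaluator
    version: [">=0.8.0", "<1.0.0"]   # placeholder range (assumption)
```

After `dbt deps`, running `dbt build --select package:dbt_project_evaluator` builds the package's models and tests against your project.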

Ways to keep the weeds under control

  1. Perform code reviews for every change, and make reviews easy!
  2. Review your project's architecture often
  3. Periodically check whether your project's execution behavior has changed
    • Track materialization run times (dbt_artifacts or dbt Cloud) to find bottlenecks in your project
    • Leverage query usage data to identify unused models
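
A minimal sketch of the first bullet, assuming the dbt_artifacts package is installed and is a version that provides the `upload_results` macro. The hook stores each run's results in your warehouse, where run times can be queried.

```yaml
# dbt_project.yml
on-run-end:
  - "{{ dbt_artifacts.upload_results(results) }}"   # runs after every dbt invocation
```

Comparing model execution times across runs in the resulting tables is one way to spot models that are getting slower.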

Take a short break

and then grow an even better future


Nicholas A. Yager

nicholasyager.com
github.com/nicholasyager
