---
theme: mdsFest
marp: true
---
Scaling Your dbt Project Like a Gardener
they/them
Principal Analytics Engineer
HubSpot
- Large organizations trend towards decentralization as they grow
- Decentralization can lead to inconsistent standards and significant overhead
- It's so easy to add "just one more" model
- Survey your garden
- Clear out the trash and weeds
- Renewal pruning
- Divide the perennials
- Keep the weeds under control
- What are your core entities?
- What are your exposures?
- How are your data consumers using your models?
- Are there any obvious architectural issues?
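One way to start answering these survey questions is to count how many models sit downstream of each model: the ones with many descendants are candidates for core entities. The sketch below assumes you have loaded the `child_map` section of dbt's `manifest.json` into a dictionary. The project name `my_project` and every model name other than `deals` and `stg_crm__customers` are hypothetical.

```python
from collections import deque


def count_downstream_models(child_map):
    """Return {model_id: number of distinct models downstream of it}."""
    counts = {}
    for node_id in child_map:
        if not node_id.startswith("model."):
            continue  # skip tests, exposures, and other resource types
        seen = set()
        queue = deque(child_map.get(node_id, []))
        while queue:  # breadth-first walk over all descendants
            child = queue.popleft()
            if child in seen:
                continue
            seen.add(child)
            queue.extend(child_map.get(child, []))
        counts[node_id] = sum(1 for c in seen if c.startswith("model."))
    return counts


# Hypothetical data, shaped like the "child_map" key of manifest.json.
child_map = {
    "model.my_project.stg_crm__customers": ["model.my_project.customers"],
    "model.my_project.stg_crm__deals": ["model.my_project.deals"],
    "model.my_project.customers": [
        "model.my_project.deals",
        "model.my_project.customer_health",
    ],
    "model.my_project.deals": [
        "model.my_project.revenue_by_month",
        "exposure.my_project.revenue_dashboard",
    ],
    "model.my_project.customer_health": [],
    "model.my_project.revenue_by_month": [],
}

downstream = count_downstream_models(child_map)
for model_id, n in sorted(downstream.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{n:>3}  {model_id}")
```

Models near the top of this ranking are worth a closer look: if many things depend on a staging model rather than a well-defined entity, that may point to an architectural issue.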
groups:
  - name: revenue
    owner:
      email: [email protected]
  - name: customer_success
    owner:
      email: [email protected]

models:
  - name: deals
    group: revenue
    access: public
  - name: stg_crm__customers
    group: revenue
    access: private
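dbt enforces these access rules itself when it parses the project, so the sketch below is only an illustration of the rule, not a replacement for it: a `private` model may be referenced only from models in the same group. It assumes node dictionaries shaped like the `nodes` section of `manifest.json`; the project name `my_project` and the `customer_health` model are hypothetical.

```python
def find_access_violations(nodes):
    """Return (child_id, parent_id) pairs where a model references a
    private model that belongs to a different group."""
    violations = []
    for node_id, node in nodes.items():
        for parent_id in node.get("depends_on", {}).get("nodes", []):
            parent = nodes.get(parent_id)
            if parent is None:
                continue  # parent is a source or lives outside this dict
            if parent.get("access") == "private" and parent.get("group") != node.get("group"):
                violations.append((node_id, parent_id))
    return violations


# Hypothetical nodes mirroring the groups and models defined above.
nodes = {
    "model.my_project.stg_crm__customers": {
        "group": "revenue",
        "access": "private",
        "depends_on": {"nodes": []},
    },
    "model.my_project.deals": {
        "group": "revenue",
        "access": "public",
        "depends_on": {"nodes": ["model.my_project.stg_crm__customers"]},
    },
    "model.my_project.customer_health": {
        "group": "customer_success",
        "access": "protected",
        "depends_on": {
            "nodes": [
                "model.my_project.deals",  # fine: deals is public
                "model.my_project.stg_crm__customers",  # not allowed: private, other group
            ]
        },
    },
}

violations = find_access_violations(nodes)
print(violations)
```

Here `deals` can build on `stg_crm__customers` because both are in the `revenue` group, while the `customer_success` model has to go through the public `deals` model instead.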
models:
  - name: deals
    group: revenue
    access: public
    columns:
      - name: deal_id
        data_type: int
      - name: favorite_color
        data_type: varchar
    latest_version: 1
    versions:
      - v: 1 # Version described above
        deprecation_date: 2023-08-30 # Deprecation warnings will be returned when referenced
      - v: 2 # The new version in pre-release. Removes favorite_color
        columns:
          - include: all
            exclude: [favorite_color]
models:
  - name: deals
    group: go_to_market
    access: public
    columns:
      - name: deal_id
        data_type: int
      - name: favorite_color
        data_type: varchar
    latest_version: 2 # Upgrade to the new version!!1!
    versions:
      - v: 1
        deprecation_date: 2023-08-30
      - v: 2
        columns:
          - include: all
            exclude: [favorite_color]
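To make the effect of this spec concrete, here is a small sketch of how the version entries can be read: a version with no `columns` key inherits the top-level columns, `include: all` with `exclude` drops the listed columns, and an unpinned reference resolves to `latest_version`. This reflects my reading of dbt's documented behavior and is not dbt's own implementation; the helper names `effective_columns` and `resolve_version` are hypothetical, and the spec is written as a Python dictionary to keep the example free of third-party YAML parsers.

```python
def effective_columns(model, version):
    """Return the column names a given model version exposes."""
    base = [col["name"] for col in model.get("columns", [])]
    spec = next(v for v in model["versions"] if v["v"] == version)
    entries = spec.get("columns")
    if entries is None:
        return base  # no override: inherit the top-level columns
    result = []
    for entry in entries:
        if "include" in entry:
            include = entry["include"]
            names = base if include in ("all", "*") else [n for n in base if n in include]
            excluded = set(entry.get("exclude", []))
            result.extend(n for n in names if n not in excluded)
        else:
            result.append(entry["name"])  # column defined only for this version
    return result


def resolve_version(model, pinned=None):
    """An unpinned reference resolves to latest_version."""
    return pinned if pinned is not None else model["latest_version"]


# The versions spec above, as a dictionary (group and access omitted).
deals_spec = {
    "name": "deals",
    "columns": [
        {"name": "deal_id", "data_type": "int"},
        {"name": "favorite_color", "data_type": "varchar"},
    ],
    "latest_version": 2,
    "versions": [
        {"v": 1, "deprecation_date": "2023-08-30"},
        {"v": 2, "columns": [{"include": "all", "exclude": ["favorite_color"]}]},
    ],
}

for v in (1, 2):
    print(v, effective_columns(deals_spec, v))
print("unpinned reference resolves to version", resolve_version(deals_spec))
```

Consumers that still need `favorite_color` can pin version 1 until its deprecation date, while everyone else moves to version 2 when `latest_version` is bumped.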
- Perform code reviews for every change, and make reviews easy!
  - Show how your DAG changes in each PR
  - Pick your SQL coding conventions and enforce them using SQL formatters
  - Use CI/CD and dbt tests
- Review your project's architecture often
  - Use an architecture evaluation tool like dbt Project Evaluator or Whetstone
  - Check for undervalued core entities
  - Be on the lookout for commonly joined models
- Periodically check to see if the execution behavior of your project has changed
  - Track materialization run times (dbt_artifacts or dbt Cloud) to find bottlenecks in your project
  - Leverage query usage data to identify unused models
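As a rough sketch of the last two checks, the functions below rank models by execution time and list models that nobody has queried. They assume dictionaries shaped like dbt's `run_results.json` (`results[].unique_id`, `results[].execution_time`) and the `nodes` section of `manifest.json` (`resource_type`, `schema`, `alias`). The set of queried relations is a hypothetical input: in practice it would come from your warehouse's query history, and the `schema.table` format used here is an assumption. All model names other than `deals` are made up for illustration.

```python
def slowest_models(run_results, top_n=3):
    """Return the top_n (model_id, seconds) pairs by execution time."""
    timings = [
        (r["unique_id"], r["execution_time"])
        for r in run_results["results"]
        if r["unique_id"].startswith("model.")
    ]
    return sorted(timings, key=lambda t: t[1], reverse=True)[:top_n]


def unused_models(manifest_nodes, queried_relations):
    """Return ids of models whose relation never appears in the query log."""
    queried = {q.lower() for q in queried_relations}
    unused = []
    for node_id, node in manifest_nodes.items():
        if node.get("resource_type") != "model":
            continue
        relation = f'{node["schema"]}.{node["alias"]}'.lower()
        if relation not in queried:
            unused.append(node_id)
    return sorted(unused)


# Hypothetical artifacts and query log.
run_results = {
    "results": [
        {"unique_id": "model.my_project.deals", "execution_time": 412.5},
        {"unique_id": "model.my_project.customers", "execution_time": 88.0},
        {"unique_id": "model.my_project.legacy_deals_rollup", "execution_time": 35.2},
        {"unique_id": "test.my_project.not_null_deals_deal_id", "execution_time": 1.1},
    ]
}
manifest_nodes = {
    "model.my_project.deals": {"resource_type": "model", "schema": "analytics", "alias": "deals"},
    "model.my_project.customers": {"resource_type": "model", "schema": "analytics", "alias": "customers"},
    "model.my_project.legacy_deals_rollup": {
        "resource_type": "model",
        "schema": "analytics",
        "alias": "legacy_deals_rollup",
    },
}
queried_relations = {"ANALYTICS.DEALS", "analytics.customers"}

print(slowest_models(run_results, top_n=2))
print(unused_models(manifest_nodes, queried_relations))
```

A model that is both slow and unused is an easy pruning candidate. A model missing from the query log may still feed other models, so check its downstream dependencies before removing it.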
and then grow an even better future
Nicholas A. Yager