Add support for Iceberg Table Materialization #1170

Merged
merged 29 commits into from
Sep 12, 2024

VersusFacit
Contributor

@VersusFacit VersusFacit commented Aug 22, 2024

resolves: #321

Add support for Iceberg Table Materialization

Built atop jaffle shop classic for testing.

Problem

We want to add support for materializing Iceberg tables backed by an S3 bucket in Snowflake.

Example of the model config interface we decided on:

{{
  config(
    materialized = "table",
    table_format="iceberg",
    external_volume="s3_iceberg_snow",
    base_location="milas_working_folder",
  )
}}
-- transient is automatically set to false for Iceberg tables
-- `transient=true` is ignored
select * from {{ ref('raw_orders') }}
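As a rough illustration of how a config like this could translate into DDL, here is a minimal Python sketch. The `ModelConfig` class and `render_create_ddl` function are hypothetical names invented for this example, not the adapter's actual code, and the clause layout of the Iceberg statement (`external_volume`, `catalog`, `base_location`) is an assumption about the generated SQL rather than something shown in this PR description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the model config shown above."""
    materialized: str = "table"
    table_format: str = "default"
    external_volume: Optional[str] = None
    base_location: Optional[str] = None
    transient: bool = True


def render_create_ddl(name: str, select_sql: str, cfg: ModelConfig) -> str:
    """Render a CREATE ... AS SELECT statement for the config (illustrative only)."""
    if cfg.table_format == "iceberg":
        if not cfg.external_volume or not cfg.base_location:
            raise ValueError("iceberg tables need external_volume and base_location")
        # Per the PR description, `transient` is ignored for Iceberg tables,
        # so cfg.transient is deliberately not consulted in this branch.
        return (
            f"create or replace iceberg table {name}\n"
            f"  external_volume = '{cfg.external_volume}'\n"
            f"  catalog = 'snowflake'\n"
            f"  base_location = '{cfg.base_location}'\n"
            f"as (\n{select_sql}\n)"
        )
    # Non-Iceberg tables keep honoring the transient setting.
    kind = "transient table" if cfg.transient else "table"
    return f"create or replace {kind} {name} as (\n{select_sql}\n)"


if __name__ == "__main__":
    cfg = ModelConfig(
        table_format="iceberg",
        external_volume="s3_iceberg_snow",
        base_location="milas_working_folder",
    )
    print(render_create_ddl("orders", "select * from raw_orders", cfg))
```

Keeping the Iceberg branch separate from the default branch mirrors the idea of inserting extra DDL into the existing table materialization instead of writing a new one.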

Solution

  • Insert the additional Iceberg DDL precisely into the existing table materialization
  • Move as much custom logic as possible into Python
  • Manually attach is_iceberg as a field to the show objects results (Snowflake will give us this for free one day)
  • Use is_iceberg to decide when to call drop_relation_if_exists before building a table (table -> Iceberg, Iceberg -> table)
  • Gate the feature behind a behavior flag for those who have no need of Iceberg-format tables
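The drop decision and the flag gate from the list above can be sketched in Python. Both function names, their parameters, and the choice to raise an error when the flag is off are assumptions made for illustration; the PR description only states that is_iceberg drives the drop_relation_if_exists calls and that the feature sits behind a behavior flag.

```python
from typing import Optional


def needs_drop_before_build(
    existing_is_iceberg: Optional[bool], target_is_iceberg: bool
) -> bool:
    """Return True when the existing relation must be dropped before the build.

    existing_is_iceberg is None when no relation exists yet.
    A drop is only needed when the format changes
    (table -> Iceberg or Iceberg -> table).
    """
    if existing_is_iceberg is None:
        return False
    return existing_is_iceberg != target_is_iceberg


def resolve_table_format(requested: str, iceberg_flag_enabled: bool) -> str:
    """Hypothetical gate: refuse the iceberg format unless the flag is on."""
    if requested == "iceberg" and not iceberg_flag_enabled:
        raise RuntimeError("Iceberg tables are disabled by the behavior flag")
    return requested


if __name__ == "__main__":
    # An existing regular table is being rebuilt as Iceberg: drop it first.
    print(needs_drop_before_build(existing_is_iceberg=False, target_is_iceberg=True))
    # Same format on both sides: a plain replace is enough.
    print(needs_drop_before_build(existing_is_iceberg=True, target_is_iceberg=True))
```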

Caveats

This is ready for review, but I can identify two more things it needs:

  1. A behavior flag to gate Iceberg (did we merge that capability yet?)
  2. Activated tests (we have an Iceberg volume on our team cluster, but it doesn't seem to be live yet on the CI cluster here)

Impact

Get this shipped for Cloud folks to start using!!

Manually tested scenarios

  • Build an Iceberg table that refs a table
  • Build an Iceberg table on top of a table
  • Seamless creates from table to Iceberg table and back
  • Stress tests to confirm that the memory usage / time tradeoff of the left join with show objects was worth it
  • A complex config model, with cluster by and Iceberg at the same time
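The left join with show objects mentioned in the stress-test scenario can be pictured with a small in-memory Python sketch. It assumes the show objects output is available as a list of dicts with a `name` key and that a separate list of Iceberg table names exists; how the PR actually obtains and joins these is not shown here, and `attach_is_iceberg` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def attach_is_iceberg(
    show_objects_rows: Iterable[Dict[str, str]],
    iceberg_table_names: Iterable[str],
) -> List[Dict[str, object]]:
    """Left-join show-objects rows against Iceberg table names.

    Every input row is kept; rows without a match get is_iceberg=False.
    Names are compared case-insensitively (an assumption for this sketch).
    """
    iceberg = {name.upper() for name in iceberg_table_names}
    return [
        {**row, "is_iceberg": row["name"].upper() in iceberg}
        for row in show_objects_rows
    ]


if __name__ == "__main__":
    rows = [
        {"name": "ORDERS", "kind": "TABLE"},
        {"name": "RAW_ORDERS", "kind": "TABLE"},
    ]
    for row in attach_is_iceberg(rows, ["orders"]):
        print(row)
```

Building the name set once keeps the lookup per row constant-time, which is the kind of memory versus time tradeoff the stress tests were checking.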

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • (well, err, sort of) edit: Now, this PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@cla-bot cla-bot bot added the cla:yes label Aug 22, 2024
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-snowflake contributing guide.

@VersusFacit VersusFacit force-pushed the mp/add_iceberg_tables branch 2 times, most recently from c959b7b to 7da28be Compare August 29, 2024 06:51
@VersusFacit VersusFacit force-pushed the mp/add_iceberg_tables branch from 7da28be to 1887208 Compare August 29, 2024 07:10
@VersusFacit VersusFacit force-pushed the mp/add_iceberg_tables branch from 32ad646 to 8150261 Compare August 29, 2024 07:32
@VersusFacit VersusFacit changed the title from "Iceberg support prototyping" to "Add support for Iceberg Table Materialization" Sep 11, 2024
@VersusFacit VersusFacit marked this pull request as ready for review September 11, 2024 08:57
@VersusFacit VersusFacit requested a review from a team as a code owner September 11, 2024 08:57
dbt/adapters/snowflake/relation_configs/formats.py (outdated review thread, resolved)
dbt/include/snowflake/macros/adapters.sql (outdated review thread, resolved)
dbt/adapters/snowflake/relation.py (outdated review thread, resolved)
dbt/adapters/snowflake/relation.py (outdated review thread, resolved)
tests/functional/iceberg/test_table_basic.py (outdated review thread, resolved)
tests/functional/iceberg/test_table_basic.py (outdated review thread, resolved)
dbt/include/snowflake/macros/materializations/table.sql (outdated review thread, resolved)
dbt/adapters/snowflake/impl.py (outdated review thread, resolved)
@VersusFacit
Contributor Author

Blocked by the behavior flag / Jinja security issue. Once we have a path forward, this PR is ready for code review round 2!

@VersusFacit VersusFacit merged commit 49623d7 into main Sep 12, 2024
19 checks passed
@VersusFacit VersusFacit deleted the mp/add_iceberg_tables branch September 12, 2024 20:26
Successfully merging this pull request may close these issues.

[CT-1492] [Feature] Add support for Iceberg Tables
3 participants