WIP - Glue integration #529

Arun-kc · 2023-09-07T15:17:11Z

[ELE-47] Add integration with S3 as a data lake


Arun-kc · 2023-09-07T15:31:16Z

macros/utils/table_operations/replace_table_data.sql

+{# Glue - truncate and insert (non-atomic) #}
+{% macro glue__replace_table_data(relation, rows) %}
+    {% set intermediate_relation = elementary.create_intermediate_relation(relation, rows, temporary=True) %}
+    {% do dbt.glue_exec_query(dbt.get_insert_overwrite_sql(intermediate_relation, relation)) %}


Ideally this would use run_query, but I'm not able to pass the flag as 'False' for DDL and DML statements. That causes issues in run_query when it tries to convert the column names to lowercase, so I'm using glue_exec_query for now.

Note: when DDL or DML statements are passed to run_query in dbt, it returns None.

Arun-kc · 2023-09-07T15:34:42Z

macros/utils/table_operations/insert_rows.sql

      {% endfor %}
    {% elif insert_rows_method == 'chunk' %}
      {% set rows_chunks = elementary.split_list_to_chunks(rows, chunk_size) %}
      {% for rows_chunk in rows_chunks %}
        {% set insert_rows_query = elementary.get_chunk_insert_query(table_relation, columns, rows_chunk) %}
-        {% do elementary.run_query(insert_rows_query) %}
+        {% if target.type == 'glue' %}
+          {% do dbt.glue_exec_query(insert_rows_query) %}


Same reason as #r1318786335

Arun-kc · 2023-09-07T15:34:49Z

macros/utils/table_operations/insert_rows.sql

@@ -22,13 +22,21 @@
      {% set queries_len = insert_rows_queries | length %}
      {% for insert_query in insert_rows_queries %}
        {% do elementary.file_log("[{}/{}] Running insert query.".format(loop.index, queries_len)) %}
-        {% do elementary.run_query(insert_query) %}
+        {% if target.type == 'glue' %}
+          {% do dbt.glue_exec_query(insert_query) %}


Same reason as #r1318786335
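
The same adapter check appears in both insert paths of insert_rows.sql. As a minimal sketch, the branching could be factored into one helper macro. The macro name `run_insert_query` is hypothetical and not part of this PR; it assumes `dbt.glue_exec_query` and `elementary.run_query` are callable exactly as in the hunks above, and that every non-Glue adapter keeps the original `elementary.run_query` call.

```jinja
{# Hypothetical helper (not in this PR): run one insert statement per adapter. #}
{% macro run_insert_query(query) %}
  {% if target.type == 'glue' %}
    {# Glue: bypass elementary.run_query, as the diff above does #}
    {% do dbt.glue_exec_query(query) %}
  {% else %}
    {# All other adapters: keep the original behaviour #}
    {% do elementary.run_query(query) %}
  {% endif %}
{% endmacro %}
```

If the helper lived in the elementary package, each call site would reduce to `{% do elementary.run_insert_query(insert_query) %}`, so the Glue workaround could later be dropped in one place.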

Arun-kc · 2023-09-07T15:36:05Z

macros/utils/table_operations/insert_rows.sql

@@ -57,7 +65,7 @@
          {% do rendered_column_values.append(column_value) %}
        {% else %}
          {% set column_value = elementary.insensitive_get_dict_value(row, column.name) %}
-          {% do rendered_column_values.append(elementary.render_value(column_value)) %}
+          {% do rendered_column_values.append(elementary.render_value(column_value, column.data_type)) %}


Passing the column's data_type so that timestamp columns can be cast for Glue.
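
A rough sketch of how a type-aware renderer might use the extra argument. This is not the actual `elementary.render_value` implementation: the macro name `render_value_sketch` is hypothetical, the type name `timestamp` is an assumption about what `column.data_type` reports on Glue, and string escaping and the other value types are deliberately left out.

```jinja
{# Hypothetical sketch only; the real render_value handles more cases. #}
{% macro render_value_sketch(value, data_type=none) %}
  {%- if value is none -%}
    NULL
  {%- elif target.type == 'glue' and data_type is not none and (data_type | lower) == 'timestamp' -%}
    {# Assumption: Glue needs an explicit cast for timestamp literals #}
    cast('{{ value }}' as timestamp)
  {%- elif value is number -%}
    {{ value }}
  {%- else -%}
    {# Escaping of quotes is omitted in this sketch #}
    '{{ value }}'
  {%- endif -%}
{% endmacro %}
```

Defaulting `data_type` to `none` would leave existing callers that pass only the value unaffected.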


haritamar · 2024-05-28T10:56:32Z

Hi @Arun-kc,
Thanks for this work, and sorry it took us so long to address this PR. We are working on improving our process for PRs and GitHub issues; admittedly, these have been neglected over the past few months due to other priorities.

Since it's been a while and there are quite a few conflicts, I'll close this PR for now.
If you still wish to contribute this and can update the code, I'll be happy to review it. Ensuring that the integration tests pass on the Glue adapter (here) will also help speed up the review process.

Arun-kc added 23 commits July 29, 2023 17:13

fix: indentation error of generate_schema_baseline_test macro

5b9dcac

feat: add incremental_strategy for dbt-glue

e05cd0d

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

6f3eb70

Merge branch 'ele-47-add-integration-with-s3' of https://github.com/Arun-kc/dbt-data-reliability into ele-47-add-integration-with-s3

9ded65c

feat: add glue incremental materialization

4cce0c1

feat: add glue table materialization

1cf8b8f

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

f2d0e38

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

d2580c2

feat: add glue__replace_table_data

1b24722

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

ee85c5d

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

afc20cd

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

9d27bf9

fix: update glue incremental materialization

4445b12

update create_table_like macro with glue

b65eea4

add glue to replace_table_data

0a0aa76

fixed glue__replace_table_data

452f0f2

add glue to make_temp_relation

3587f5c

update edr_quote_column with glue

0f5887e

add glue to insert_rows

74bb6c4

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

a890dc3

fix: explicit cast in query_table_metrics

169d494

add glue timestamp cast

82907bb

Merge branch 'elementary-data:master' into ele-47-add-integration-with-s3

62e366f


Arun-kc added 3 commits September 8, 2023 20:46

change models on_schema_change

b93b90c

Merge branch 'ele-47-add-integration-with-s3' of https://github.com/Arun-kc/dbt-data-reliability into ele-47-add-integration-with-s3

685d52e

change model on_schema_change

87a09fb

haritamar closed this May 28, 2024
