WIP - Glue integration #529
Conversation
…run-kc/dbt-data-reliability into ele-47-add-integration-with-s3
{# Glue - truncate and insert (non-atomic) #}
{% macro glue__replace_table_data(relation, rows) %}
  {% set intermediate_relation = elementary.create_intermediate_relation(relation, rows, temporary=True) %}
  {% do dbt.glue_exec_query(dbt.get_insert_overwrite_sql(intermediate_relation, relation)) %}
Ideally this should be using `run_query`, but we are not able to pass the flag as 'False' for DDL and DML statements, which in turn causes issues in `run_query` when it tries to change the case of columns to lowercase. So using `glue_exec_query` for now.
Note: when DDL and DML statements are passed to `run_query` in dbt, it returns `none`.
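For context, here is a minimal sketch of the dispatch pattern this describes: branch on the target type before executing a statement. The wrapper macro name `execute_statement` is hypothetical; this PR inlines the `target.type == 'glue'` check at each call site, as in the hunks below.

```jinja
{# Sketch only: hypothetical wrapper around the two execution paths.
   On Glue, run_query lowercases column names and returns none for
   DDL/DML statements, so glue_exec_query is used there instead. #}
{% macro execute_statement(query) %}
  {% if target.type == 'glue' %}
    {% do dbt.glue_exec_query(query) %}
  {% else %}
    {% do elementary.run_query(query) %}
  {% endif %}
{% endmacro %}
```

Wrapping the branch in a single macro like this would keep the target check in one place instead of repeating it at every insert site.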
      {% endfor %}
    {% elif insert_rows_method == 'chunk' %}
      {% set rows_chunks = elementary.split_list_to_chunks(rows, chunk_size) %}
      {% for rows_chunk in rows_chunks %}
        {% set insert_rows_query = elementary.get_chunk_insert_query(table_relation, columns, rows_chunk) %}
-       {% do elementary.run_query(insert_rows_query) %}
+       {% if target.type == 'glue' %}
+         {% do dbt.glue_exec_query(insert_rows_query) %}
Same reason as #r1318786335
@@ -22,13 +22,21 @@
    {% set queries_len = insert_rows_queries | length %}
    {% for insert_query in insert_rows_queries %}
      {% do elementary.file_log("[{}/{}] Running insert query.".format(loop.index, queries_len)) %}
-     {% do elementary.run_query(insert_query) %}
+     {% if target.type == 'glue' %}
+       {% do dbt.glue_exec_query(insert_query) %}
Same reason as #r1318786335
@@ -57,7 +65,7 @@
        {% do rendered_column_values.append(column_value) %}
      {% else %}
        {% set column_value = elementary.insensitive_get_dict_value(row, column.name) %}
-       {% do rendered_column_values.append(elementary.render_value(column_value)) %}
+       {% do rendered_column_values.append(elementary.render_value(column_value, column.data_type)) %}
Passing the column `data_type` in order to cast timestamp columns for Glue.
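For illustration, here is a rough sketch of how the `data_type` might be used to cast timestamp values for Glue; the macro name `render_value_for_glue` and the exact cast logic are assumptions, not the code from this PR.

```jinja
{# Sketch only: hypothetical helper that wraps timestamp literals in an
   explicit cast so Glue interprets them as timestamps rather than strings. #}
{% macro render_value_for_glue(value, data_type) %}
  {% if data_type and 'timestamp' in data_type | lower %}
    {{ return("cast('" ~ value ~ "' as timestamp)") }}
  {% else %}
    {{ return(elementary.render_value(value)) }}
  {% endif %}
{% endmacro %}
```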
Hi @Arun-kc, since it's been a while and there are quite a few conflicts, I'll close this PR for now.
[ELE-47] Add integration with S3 as a data lake