Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Glue integration #529

Closed

Conversation

Arun-kc
Copy link
Contributor

@Arun-kc Arun-kc commented Sep 7, 2023

{# Glue - truncate and insert (non-atomic) #}
{% macro glue__replace_table_data(relation, rows) %}
{% set intermediate_relation = elementary.create_intermediate_relation(relation, rows, temporary=True) %}
{% do dbt.glue_exec_query(dbt.get_insert_overwrite_sql(intermediate_relation, relation)) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally should be using run_query, but not able to pass the flag as 'False' for DDL and DML statements which in turn is causing issues at run_query when it tries to change the case of columns into lowercase. So using glue_exec_query for now.

Note: When DDL and DML statements are passed to run_query in dbt it will return none

{% endfor %}
{% elif insert_rows_method == 'chunk' %}
{% set rows_chunks = elementary.split_list_to_chunks(rows, chunk_size) %}
{% for rows_chunk in rows_chunks %}
{% set insert_rows_query = elementary.get_chunk_insert_query(table_relation, columns, rows_chunk) %}
{% do elementary.run_query(insert_rows_query) %}
{% if target.type == 'glue' %}
{% do dbt.glue_exec_query(insert_rows_query) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as #r1318786335

@@ -22,13 +22,21 @@
{% set queries_len = insert_rows_queries | length %}
{% for insert_query in insert_rows_queries %}
{% do elementary.file_log("[{}/{}] Running insert query.".format(loop.index, queries_len)) %}
{% do elementary.run_query(insert_query) %}
{% if target.type == 'glue' %}
{% do dbt.glue_exec_query(insert_query) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as #r1318786335

@@ -57,7 +65,7 @@
{% do rendered_column_values.append(column_value) %}
{% else %}
{% set column_value = elementary.insensitive_get_dict_value(row, column.name) %}
{% do rendered_column_values.append(elementary.render_value(column_value)) %}
{% do rendered_column_values.append(elementary.render_value(column_value, column.data_type)) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing column data_type in order to cast timestamp columns for glue

@haritamar
Copy link
Collaborator

Hi @Arun-kc ,
Thanks for this work and sorry it took us so long to address this PR. We are working on improving our process with PRs and Github issues and admittedly these have been neglected in the past few months due to other priorities.

Since it's been a while and there are quite a few conflicts, I'll close this PR for now.
If you still wish to contribute this and can update the code, I'll be happy to review. Ensuring that integration tests pass on the Glue adapter (here) will also help to speed the review process.

@haritamar haritamar closed this May 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants