Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use deterministic job_ids to avoid retrying successful queries #977
Use deterministic job_ids to avoid retrying successful queries #977
Changes from 5 commits
a4e2b74
6056dec
28d9379
ce59491
6dcb67e
21322b5
c988153
ccb1a6e
24ebf39
afef1e8
8e71a27
6bf6c6a
e4c72f8
23180e1
8132504
6ebd25c
8a32a1e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be over-engineering, but I would consider putting the contents of this if block into its own method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be clear are you referring to lines 308-321?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm referring to lines 310-318. Then this would look like:
Something else worth considering is whether we want to handle all of the threads within a connection. I don't know if there's more than one thread for a connection, but I feel like there is. If there's a connection with more than one thread, you'll close that connection in the second condition above when you get to the first thread. Then you'll skip past the second condition for every other thread since
connection.state
should be closed at that point.I think what you probably want is a list of job_ids by connection. Then for each connection you would cancel the job. Once all jobs are cancelled, then close the connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we specifically want all connections which are not
this_connection
? Or do we only want connections in which we cancelled jobs? In this current flow, a connection for whichconnection.state == ConnectionState.CLOSED
will show up innames
, which doesn't feel like an intuitive list to get from `cancel_open'.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i backtrack
open_cancel
to core it looks like one of the only places we call it is forcancel_open_connections
which does make me think the desired result is that all closed connections are accounted for?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words we should only be returning the connections which we cancelled during this call, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think
uuid.uuid4()
is deterministic, which meansjob_id
is not either. Have you considered anmd5
hash of sufficient attributes (model, connection name, etc.)?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently calling
uuid
directly as part of getting unit tests swapped over for functionality I think initial/current plan was to use theinvocation_id
we define via tracking in core https://docs.getdbt.com/reference/dbt-jinja-functions/invocation_id and it itself is auuid
based on docs.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using invocation_id (which we only sometimes have) should we just use the actual query text (which we have to have)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the constraint on
job_id
? Is there a max length? Can all characters that go into a model name also be used in a job id?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will definitely have to test this, I think all characters are fine as we should just be combining 2 strings but length may hit a limit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would want to make sure that we can submit a job_id to BQ with some weird characters. People put all kinds of things in their model names. An alternative is to hash the model_name so that it's only alpha-numeric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 if we want to stick with uuid we can just generate a deterministic one with
uuid.uuid5