Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix assorted source freshness edgecases so check is run or actionable information #9825

Merged
merged 5 commits into from
Mar 27, 2024

Conversation

QMalcolm
Copy link
Contributor

resolves #9078

Problem

We were hitting multiple edge cases in the source freshness task which were raising unhelpful error messages.

Solution

  • update BaseTask to gracefully handle nodes without build_path
  • ensure invocations of Note events use keyword params
  • raise a runtime error when a source freshness task is missing loaded_at_field AND the project adapter doesn't support metadata checks

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@QMalcolm QMalcolm requested a review from a team as a code owner March 26, 2024 23:26
@cla-bot cla-bot bot added the cla:yes label Mar 26, 2024
Some nodes, like SourceDefinition nodes, don't have a `build_path` property.
This is problematic because we take in nodes with no type checking, and
assume they have properties sometimes, like `build_path`. This was just
the case in BaseRunner's `_handle_generic_exception` and
`_handle_interal_exception` methods. Thus to stop dbt from crashing when
trying to handle an exception related to a node without a `build_path`,
we added an private method to the BaseRunner class for safely trying
to get `build_path`.
Previously we were passing arguments during the `Note` event instantiations
in freshness.py as positional arguments. This would cause not the desired
`Note` event to be emitted, but instead get the message
```
[Note] Don't use positional arguments when constructing logging events
```
which was our fault, not the users'. Additionally, we were passing the
level for the event in the `Note` instantiation when we needed to be
passing it to the `fire_event` method.
… possible

Previously if a source freshness check didn't have a `loaded_at_field` and
metadata source freshness wasn't supported by the adapter, then we'd log
a warning message and let the source freshness check continue. This was problematic
because the source freshness check couldn't actually continue and the process
would raise an error in the form
```
type object argument after ** must be a mapping, not NoneType
```
because the `freshness` variable was never getting set. This error wasn't particularly
helpful for any person running into it. So instead of letting that error
happen we now deliberately raise an error with helpful information.
Copy link

codecov bot commented Mar 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.10%. Comparing base (952cca8) to head (1abcaa4).
Report is 12 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9825      +/-   ##
==========================================
+ Coverage   88.06%   88.10%   +0.03%     
==========================================
  Files         178      178              
  Lines       22495    22461      -34     
==========================================
- Hits        19810    19789      -21     
+ Misses       2685     2672      -13     
Flag Coverage Δ
integration 85.46% <66.66%> (+0.01%) ⬆️
unit 61.79% <66.66%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…error

This test directly tests that when a source freshness check doesn't have a
`loaded_at_field` and the adapter in use doesn't support metadata checks,
then the appropriate error message gets raised. That is, it directly tests
the change made in a162d53. This test indirectly tests the changes in both
7ec2f82 and 7b0ff31 as the appropriate error can only be raised because
we've fixed other upstream issues via those commits.
@QMalcolm QMalcolm force-pushed the qmalcolm--9078-fix-source-freshness-error-handling branch from 7d6f64e to 1abcaa4 Compare March 26, 2024 23:30
@QMalcolm QMalcolm requested review from MichelleArk and emmyoop March 26, 2024 23:31
@QMalcolm QMalcolm merged commit a029661 into main Mar 27, 2024
62 checks passed
@QMalcolm QMalcolm deleted the qmalcolm--9078-fix-source-freshness-error-handling branch March 27, 2024 15:41
Copy link
Contributor

The backport to 1.7.latest failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.7.latest 1.7.latest
# Navigate to the new working tree
cd .worktrees/backport-1.7.latest
# Create a new branch
git switch --create backport-9825-to-1.7.latest
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a029661e237357087823cf8052a1970bed9e39cd
# Push it to GitHub
git push --set-upstream origin backport-9825-to-1.7.latest
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.7.latest

Then, create a pull request where the base branch is 1.7.latest and the compare/head branch is backport-9825-to-1.7.latest.

QMalcolm added a commit that referenced this pull request Mar 27, 2024
… information (#9825)

* Ensure BaseRunner handles nodes without `build_path`

Some nodes, like SourceDefinition nodes, don't have a `build_path` property.
This is problematic because we take in nodes with no type checking, and
assume they have properties sometimes, like `build_path`. This was just
the case in BaseRunner's `_handle_generic_exception` and
`_handle_interal_exception` methods. Thus to stop dbt from crashing when
trying to handle an exception related to a node without a `build_path`,
we added an private method to the BaseRunner class for safely trying
to get `build_path`.

* Use keyword arguments when instantiating `Note` events in freshness.py

Previously we were passing arguments during the `Note` event instantiations
in freshness.py as positional arguments. This would cause not the desired
`Note` event to be emitted, but instead get the message
```
[Note] Don't use positional arguments when constructing logging events
```
which was our fault, not the users'. Additionally, we were passing the
level for the event in the `Note` instantiation when we needed to be
passing it to the `fire_event` method.

* Raise error when `loaded_at_field` is `None` and metadata check isn't possible

Previously if a source freshness check didn't have a `loaded_at_field` and
metadata source freshness wasn't supported by the adapter, then we'd log
a warning message and let the source freshness check continue. This was problematic
because the source freshness check couldn't actually continue and the process
would raise an error in the form
```
type object argument after ** must be a mapping, not NoneType
```
because the `freshness` variable was never getting set. This error wasn't particularly
helpful for any person running into it. So instead of letting that error
happen we now deliberately raise an error with helpful information.

* Add test which ensures bad source freshness checks raise appropriate error

This test directly tests that when a source freshness check doesn't have a
`loaded_at_field` and the adapter in use doesn't support metadata checks,
then the appropriate error message gets raised. That is, it directly tests
the change made in a162d53. This test indirectly tests the changes in both
7ec2f82 and 7b0ff31 as the appropriate error can only be raised because
we've fixed other upstream issues via those commits.

* Add changelog entry for source freshness edgecase fixes
QMalcolm added a commit that referenced this pull request Mar 27, 2024
… information (#9825)

* Ensure BaseRunner handles nodes without `build_path`

Some nodes, like SourceDefinition nodes, don't have a `build_path` property.
This is problematic because we take in nodes with no type checking, and
assume they have properties sometimes, like `build_path`. This was just
the case in BaseRunner's `_handle_generic_exception` and
`_handle_interal_exception` methods. Thus to stop dbt from crashing when
trying to handle an exception related to a node without a `build_path`,
we added an private method to the BaseRunner class for safely trying
to get `build_path`.

* Use keyword arguments when instantiating `Note` events in freshness.py

Previously we were passing arguments during the `Note` event instantiations
in freshness.py as positional arguments. This would cause not the desired
`Note` event to be emitted, but instead get the message
```
[Note] Don't use positional arguments when constructing logging events
```
which was our fault, not the users'. Additionally, we were passing the
level for the event in the `Note` instantiation when we needed to be
passing it to the `fire_event` method.

* Raise error when `loaded_at_field` is `None` and metadata check isn't possible

Previously if a source freshness check didn't have a `loaded_at_field` and
metadata source freshness wasn't supported by the adapter, then we'd log
a warning message and let the source freshness check continue. This was problematic
because the source freshness check couldn't actually continue and the process
would raise an error in the form
```
type object argument after ** must be a mapping, not NoneType
```
because the `freshness` variable was never getting set. This error wasn't particularly
helpful for any person running into it. So instead of letting that error
happen we now deliberately raise an error with helpful information.

* Add test which ensures bad source freshness checks raise appropriate error

This test directly tests that when a source freshness check doesn't have a
`loaded_at_field` and the adapter in use doesn't support metadata checks,
then the appropriate error message gets raised. That is, it directly tests
the change made in a162d53. This test indirectly tests the changes in both
7ec2f82 and 7b0ff31 as the appropriate error can only be raised because
we've fixed other upstream issues via those commits.

* Add changelog entry for source freshness edgecase fixes
QMalcolm added a commit that referenced this pull request Mar 29, 2024
…heck is run or actionable information (#9825) (#9826)

* Fix assorted source freshness edgecases so check is run or actionable information (#9825)

* Ensure BaseRunner handles nodes without `build_path`

Some nodes, like SourceDefinition nodes, don't have a `build_path` property.
This is problematic because we take in nodes with no type checking, and
assume they have properties sometimes, like `build_path`. This was just
the case in BaseRunner's `_handle_generic_exception` and
`_handle_interal_exception` methods. Thus to stop dbt from crashing when
trying to handle an exception related to a node without a `build_path`,
we added an private method to the BaseRunner class for safely trying
to get `build_path`.

* Use keyword arguments when instantiating `Note` events in freshness.py

Previously we were passing arguments during the `Note` event instantiations
in freshness.py as positional arguments. This would cause not the desired
`Note` event to be emitted, but instead get the message
```
[Note] Don't use positional arguments when constructing logging events
```
which was our fault, not the users'. Additionally, we were passing the
level for the event in the `Note` instantiation when we needed to be
passing it to the `fire_event` method.

* Raise error when `loaded_at_field` is `None` and metadata check isn't possible

Previously if a source freshness check didn't have a `loaded_at_field` and
metadata source freshness wasn't supported by the adapter, then we'd log
a warning message and let the source freshness check continue. This was problematic
because the source freshness check couldn't actually continue and the process
would raise an error in the form
```
type object argument after ** must be a mapping, not NoneType
```
because the `freshness` variable was never getting set. This error wasn't particularly
helpful for any person running into it. So instead of letting that error
happen we now deliberately raise an error with helpful information.

* Add test which ensures bad source freshness checks raise appropriate error

This test directly tests that when a source freshness check doesn't have a
`loaded_at_field` and the adapter in use doesn't support metadata checks,
then the appropriate error message gets raised. That is, it directly tests
the change made in a162d53. This test indirectly tests the changes in both
7ec2f82 and 7b0ff31 as the appropriate error can only be raised because
we've fixed other upstream issues via those commits.

* Add changelog entry for source freshness edgecase fixes

* Update imports of dbt.artifacts to pre-dbt.artifacts locations

Moving things to dbt.artifacts started to happen AFTER the release of
dbt-core 1.7, thus 1.7 has no concept of dbt.articacts. The changes brought
in via the previous commit, were cherry-picked from main (1.8.0 beta) and
thus reference dbt.artifacts. This caused everything to blow up, because
dbt.artifacts doesn't exist. Here we've updated the dbt.artifacts imports
to their pre-dbt.artifacts paths.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-3383][Bug] Metadata freshness checks that include a filter cause dbt to hang
2 participants