Workaround for Malformed Events with page_location = '/' in fct_ga4__pages #311
Comments
Seems to me like the issue is with the test - at the very least it should include I'd like to hear more about why having a page_location set to
Hi @adamribaudo-velir, thanks for your quick reply. Usually, the
@erikverheij oh ok. For some reason I thought Regardless, the faulty data has already been collected so I think your best bet is to set the severity of the error you noticed to
We had an issue with multi-site related to this that was recently fixed by @yamotech. That test should be
However, that would not cause the This looks to me to be a data collection problem like maybe someone decided to override the
Hi @dgitis, thx for the suggestion.
Looks something like that indeed. Is there a way to hook into the flow somewhere at the start, or even before the start, to remove these entries before transforming with this package?
The failure is at The package is designed for transformation that happen to all events get put in What you should do is override the To do this, you'll need to create a Then you'll need to disable the package version in your
That code was written from memory, so it may not be exactly correct.
Just an idea @erikverheij, but you could override the 'base select' macro to insert your own logic for parsing Only slightly cleaner, but thought I'd throw it out there.
Thanks guys for providing some ways to work around it! That will do the trick for me for now. It would be very nice if there were a way to do some filtering before everything starts, without the need to override logic from the package. I'm not experienced in dbt, but perhaps something like this would be possible? Adding a placeholder macro with a Note that this is only a suggestion for the nice-to-have list, as the workaround works for me. Such a feature would make it easier to stay aligned with the logic from this repo.
I've definitely thought about adding a custom SQL variable that, if set, adds a
I've encountered an issue with some events in the base_ga4__events model where page_location is set to '/' instead of containing a full URL. This issue appears across multiple GA4 properties and leads to failures in the tests defined for fct_ga4__pages, specifically the uniqueness test on the combination of page_location and event_date_dt.
Here's the test that's failing:
The failure message is:
To resolve this issue, I'm considering a workaround where records with page_location = '/' are removed at an early stage in the transformation process. However, I'm unsure of the best approach to implement this workaround without diverging significantly from the package's intended usage and while maintaining upgradeability.
Could you provide guidance on how to best address malformed page_location values within the framework of the package? Is there a recommended approach for filtering out or correcting these records before they impact downstream models and tests?
Thank you for your assistance.
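For anyone who wants to see the filter in isolation, here is a minimal sketch of the early-stage removal described above, run against an in-memory SQLite table rather than BigQuery. Only page_location and event_date_dt are names from this issue; the events table, the page_title column, and the sample rows are hypothetical stand-ins, and this is not the package's own code. In a dbt project the same where clause would presumably sit in whichever model or macro you override.

```python
import sqlite3

# Hypothetical stand-in for the events table. Only page_location and
# event_date_dt come from the issue; everything else is made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table events (event_date_dt text, page_location text, page_title text)"
)
conn.executemany(
    "insert into events values (?, ?, ?)",
    [
        ("2024-01-01", "https://example.com/", "Home"),
        ("2024-01-01", "https://example.com/about", "About"),
        ("2024-01-01", "/", "Home"),   # malformed: bare '/' instead of a full URL
        ("2024-01-01", "/", "About"),  # malformed
    ],
)

# The filter itself: drop rows whose page_location is exactly '/'.
# Note that `<>` also drops rows where page_location is NULL; append
# `or page_location is null` to the predicate if those should be kept.
FILTERED_EVENTS_SQL = """
    select event_date_dt, page_location, page_title
    from events
    where page_location <> '/'
"""

filtered_rows = conn.execute(FILTERED_EVENTS_SQL).fetchall()

# Count how many rows the filter removes, which is worth checking
# before discarding data for good.
malformed_count = conn.execute(
    "select count(*) from events where page_location = '/'"
).fetchone()[0]
```

Counting the malformed rows first is a cheap way to confirm the filter only removes what you expect before applying it upstream of the package's models.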