Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miovision: discrepencies between volumes_15min_mvt and unacceptable_gaps #868

Open
gabrielwol opened this issue Feb 12, 2024 · 1 comment · May be fixed by #1024 or #869
Open

Miovision: discrepencies between volumes_15min_mvt and unacceptable_gaps #868

gabrielwol opened this issue Feb 12, 2024 · 1 comment · May be fixed by #1024 or #869

Comments

@gabrielwol
Copy link
Collaborator

          Here are 12 dates with records that should be nulls but are not. You were lucky to pick one of them at random! I will create a new issue for this investigation. 
"datetime_bin" "count"
"2024-02-05" 46
"2024-02-01" 43
"2024-01-20" 3990
"2024-01-19" 4032
"2024-01-18" 4032
"2024-01-17" 4032
"2024-01-16" 4032
"2024-01-15" 4032
"2024-01-14" 4032
"2024-01-13" 4032
"2024-01-02" 44
"2023-01-22" 42
SELECT datetime_bin::date, COUNT(*)
FROM miovision_api.volumes_15min_mvt AS v
JOIN miovision_api.unacceptable_gaps AS un USING (intersection_uid, datetime_bin)
WHERE v.volume IS NOT NULL
GROUP BY 1
ORDER BY 1 DESC

Originally posted by @gabrielwol in #863 (comment)

@gabrielwol
Copy link
Collaborator Author

gabrielwol commented Feb 12, 2024

All these discrepencies were caused by 5 unacceptable_gap entries:

"dt" "intersection_uid" "gap_start" "gap_end"
"2023-01-22" 15 "2023-01-22 23:57:00" "2023-01-23 00:04:00"
"2024-01-02" 60 "2024-01-02 10:21:00" "2024-01-03 00:00:00"
"2024-01-12" 6 "2024-01-13 00:00:00" "2024-01-20 23:45:00"
"2024-02-02" 7 "2024-02-01 23:45:00" "2024-02-02 22:57:00"
"2024-02-05" 41 "2024-02-05 23:52:00" "2024-02-06 00:17:00"
SELECT DISTINCT un.dt, un.intersection_uid, un.gap_start, un.gap_end
FROM miovision_api.volumes_15min_mvt AS v
JOIN miovision_api.unacceptable_gaps AS un USING (intersection_uid, datetime_bin)
WHERE v.volume IS NOT NULL
    AND v.datetime_bin >= '2023-01-02' AND v.datetime_bin < '2024-02-06'

The first/fourth/fifth record above are because the find_gaps script identifies gaps that overlap midnight, but started the day before. I didn't anticipate that this would make the pipeline non-idempotent. I will remove this behaviour now at the minor expense of missing small gaps that start over midnight.

The other two issues I can't think of a cause for! I can re-aggregate these 2 ranges and we can keep our eyes out for similar issues in the future.

gabrielwol added a commit that referenced this issue Feb 12, 2024
remove 15 minute buffer + 
change to adding artificial point at each day
gabrielwol added a commit that referenced this issue Feb 12, 2024
@gabrielwol gabrielwol linked a pull request Feb 12, 2024 that will close this issue
4 tasks
@gabrielwol gabrielwol linked a pull request Feb 12, 2024 that will close this issue
4 tasks
@gabrielwol gabrielwol self-assigned this Jun 13, 2024
@gabrielwol gabrielwol added this to the Miovision pipeline updates milestone Jul 4, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment