Fix returning nulls from /data/views/query if no data in BigQuery #177

Closed
wants to merge 1 commit into from

Conversation

Contributor

@leszko leszko commented Jan 3, 2024

Before this PR, if BigQuery returns null in some of the columns, then a call to /data/views/query returns an error:

"bigquery error: error reading query result: bigquery: NULL cannot be assigned to field `PlaytimeMins` of type float64"

After this change, it will return the following:

[
    {
        "viewCount": 0,
        "playtimeMins": null,
        "ttffMs": null,
        "rebufferRatio": null,
        "errorRate": null,
        "exitsBeforeStart": null
    }
]

Related to https://linear.app/livepeer/issue/PS-191/grafana-engagement-data-issues, though it does not solve the root cause of that issue.
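As an illustration of why the response above contains null values, here is a minimal sketch of how Go's encoding/json serializes nil pointer fields. The metric type and encodeMetric function are hypothetical stand-ins, not the types used in this repository; only the JSON field names are taken from the response shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metric is a hypothetical stand-in for the API response type.
// Pointer fields can be nil, which encoding/json writes as null.
type metric struct {
	ViewCount    int64    `json:"viewCount"`
	PlaytimeMins *float64 `json:"playtimeMins"`
	TtffMs       *float64 `json:"ttffMs"`
}

// encodeMetric serializes a metric to its JSON string form.
func encodeMetric(m metric) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// No data: the pointer fields stay nil and are emitted as null.
	empty, err := encodeMetric(metric{})
	if err != nil {
		panic(err)
	}
	fmt.Println(empty)

	// With data: a non-nil pointer is emitted as a plain number.
	mins := 1.5
	filled, err := encodeMetric(metric{ViewCount: 3, PlaytimeMins: &mins})
	if err != nil {
		panic(err)
	}
	fmt.Println(filled)
}
```

A plain float64 field cannot represent "no value", which is why the scan into a non-nullable field failed before this change.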

@leszko leszko requested a review from a team as a code owner January 3, 2024 15:04

linear bot commented Jan 3, 2024

@leszko leszko requested a review from victorges January 3, 2024 15:04

codecov bot commented Jan 3, 2024

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (590d298) 28.00000% compared to head (bba6b97) 27.82875%.

Additional details and impacted files

@@                 Coverage Diff                 @@
##                main        #177         +/-   ##
===================================================
- Coverage   28.00000%   27.82875%   -0.17125%     
===================================================
  Files              4           4                 
  Lines            325         327          +2     
===================================================
  Hits              91          91                 
- Misses           223         225          +2     
  Partials          11          11                 
Files Coverage Δ
views/bigquery.go 49.32432% <ø> (ø)
views/prometheus.go 0.00000% <ø> (ø)
views/client.go 0.00000% <0.00000%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@victorges victorges left a comment

LGTM in general, but I think we should never return null for the basic viewership fields (viewcount and playtime)

And thanks for picking this one up :)

Comment on lines +83 to +84
ViewCount bigquery.NullInt64 `bigquery:"view_count"`
PlaytimeMins bigquery.NullFloat64 `bigquery:"playtime_mins"`
Member

Any idea how these can ever be null?

Not that I think we shouldn't make this fix, but these are the most basic fields that should always be there, defaulting to 0, so it kinda looks like a data pipeline problem.

Comment on lines +190 to +191
ViewCount: toInt64Ptr(row.ViewCount, spec.Detailed),
PlaytimeMins: toFloat64Ptr(row.PlaytimeMins, spec.Detailed),
Member

@victorges victorges Jan 3, 2024

These 2 fields are actually not detailed, they belong to the basic viewership metrics and should always be in the response. So they are always "asked".

Suggested change
- ViewCount: toInt64Ptr(row.ViewCount, spec.Detailed),
- PlaytimeMins: toFloat64Ptr(row.PlaytimeMins, spec.Detailed),
+ ViewCount: toInt64Ptr(row.ViewCount, true),
+ PlaytimeMins: toFloat64Ptr(row.PlaytimeMins, true),

(otherwise if you call the API without ?detailed=true you will get NO metrics on the response!)
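To make the effect of the second argument concrete, here is a rough sketch of what a helper like toInt64Ptr might do. The PR's actual implementation is not shown in this conversation, so the body below is an assumption. nullInt64 is a local stand-in with the same Int64/Valid shape as bigquery.NullInt64, so the snippet runs without the BigQuery client.

```go
package main

import "fmt"

// nullInt64 is a local stand-in mirroring the shape of bigquery.NullInt64:
// a value plus a Valid flag that is false when the column was NULL.
type nullInt64 struct {
	Int64 int64
	Valid bool
}

// toInt64Ptr is a hypothetical reconstruction of the helper in the diff.
// It returns nil when the field was not asked for or the column was NULL,
// and a pointer to the value otherwise.
func toInt64Ptr(v nullInt64, asked bool) *int64 {
	if !asked || !v.Valid {
		return nil
	}
	n := v.Int64
	return &n
}

func main() {
	row := nullInt64{Int64: 42, Valid: true}

	// Passing spec.Detailed == false drops the field even though data exists.
	fmt.Println(toInt64Ptr(row, false) == nil)

	// Passing true always keeps a valid value in the response.
	fmt.Println(*toInt64Ptr(row, true))
}
```

Under this assumption, gating ViewCount and PlaytimeMins on spec.Detailed would hide them from every non-detailed request, which is the problem the suggestion addresses.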

Comment on lines +190 to +191
ViewCount: toInt64Ptr(row.ViewCount, spec.Detailed),
PlaytimeMins: toFloat64Ptr(row.PlaytimeMins, spec.Detailed),
Member

@victorges victorges Jan 3, 2024

Thinking further about this, we probably don't want to return null on the API responses for these fields, as we shouldn't leak the datasource bug to the end-user. So IMO these fields should always be present and we should return 0 when they are null, WDYT? We could fix this directly in the SQL queries, e.g. by adding a COALESCE on those fields, or we could just add logic here along the lines of v == null ? 0 : v.
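The in-code variant of that fallback could look roughly like the sketch below. Go has no ternary operator, so the v == null ? 0 : v idea becomes a small function. nullFloat64 is a local stand-in with the same Float64/Valid shape as bigquery.NullFloat64, and float64OrZero is a hypothetical name, not a function from this repository.

```go
package main

import "fmt"

// nullFloat64 is a local stand-in mirroring the shape of
// bigquery.NullFloat64: a value plus a Valid flag.
type nullFloat64 struct {
	Float64 float64
	Valid   bool
}

// float64OrZero is a hypothetical helper that maps a NULL column to 0,
// so the API never exposes null for basic viewership fields.
func float64OrZero(v nullFloat64) float64 {
	if !v.Valid {
		return 0
	}
	return v.Float64
}

func main() {
	fmt.Println(float64OrZero(nullFloat64{}))                           // NULL column
	fmt.Println(float64OrZero(nullFloat64{Float64: 12.5, Valid: true})) // real value
}
```

The SQL alternative moves the same default into the query by wrapping the selected expression in COALESCE(..., 0), which lets the Go struct keep plain int64 and float64 fields.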

Contributor Author

leszko commented Jan 4, 2024

I think you're right, we should return 0, not null. I'm closing this PR and have opened a new one with the SELECT query updated. PTAL: #178

@leszko leszko closed this Jan 4, 2024