Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Percentile Stats lines #11

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from
Draft

Conversation

aleroyer
Copy link
Collaborator

@aleroyer aleroyer commented Feb 18, 2023

This is a first attempt in supporting Percentile Stats.

Not entirely satisfied for now as the exporter doesn't support multiple labels per time series (this will need bigger changes) but this should be sufficient to fix #10.

Samples:

rsyslog_percentile_global{counter="host_statistics.new_metric_add"} 1
rsyslog_percentile_global{counter="host_statistics.ops_overflow"} 0
rsyslog_percentile_global{counter="my_stats.new_metric_add"} 0
rsyslog_percentile_global{counter="my_stats.ops_overflow"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p50"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p95"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p99"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_count"} 6
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_max"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_min"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_sum"} 0

@aleroyer aleroyer force-pushed the feat-percentilestat branch from 4e17db7 to e579ec0 Compare February 18, 2023 18:58
Signed-off-by: Antoine Leroyer <[email protected]>
@aleroyer aleroyer marked this pull request as ready for review February 18, 2023 19:42
@aleroyer aleroyer added the enhancement New feature or request label Feb 18, 2023
@matthiasr
Copy link

The data provided reasonably resembles the shape of a Prometheus summary, I think the exporter should present them as such. Some tallying and conversion will be required but I don't see a blocker right now

@matthiasr
Copy link

Also as a general terminology note, "bucket" is strongly associated with histograms, not quantiles. I would like to avoid mixing these to avoid confusion

@aleroyer
Copy link
Collaborator Author

Thanks for the feedback @matthiasr.

I think the exporter should present them as such. Some tallying and conversion will be required but I don't see a blocker right now

I agree with summary, forgot about this type!
For the conversion part, it shouldn't be hard. I think I might get issues with the render of such metrics as this exporter is using a point struct that only support 1 label. Having a summary means having more than 1 label (quantile, name and stat maybe). I will check how to implement this gradually.

For example, I can rename this

rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p50"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p95"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p99"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_count"} 6
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_max"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_min"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_sum"} 0

into this

# summary type
rsyslog_percentile_stats{name="host_statistics", quantile="0.50", stat="test_stat"} 0
rsyslog_percentile_stats{name="host_statistics", quantile="0.95", stat="test_stat"} 0
rsyslog_percentile_stats{name="host_statistics", quantile="0.99", stat="test_stat"} 0
rsyslog_percentile_stats_count{name="host_statistics", stat="test_stat"} 6
rsyslog_percentile_stats_sum{name="host_statistics", stat="test_stat"} 0
# gauge type? I don't think they can be included in summary
rsyslog_percentile_stats_min{name="host_statistics", stat="test_stat"} 0
rsyslog_percentile_stats_max{name="host_statistics", stat="test_stat"} 0

However, count and sum here might be wrong and will not respect Prometheus summary.

As stated in Prometheus documentation:

While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
[...]

  • the total sum of all observed values, exposed as _sum
  • the count of events that have been observed, exposed as _count

But rsyslog here is exposing the sum and count for the window size specified by the user in rsyslog.conf. Not all observed values during the lifetime of the process.

msg_per_host|window_sum: sum of recorded values in the statistical population window.
msg_per_host|window_count: count of recorded values in the statistical population window.

Will this be an issue? If yes, I will need to do the sum/count part in the exporter but I don't think I can obtain something accurate.

@matthiasr
Copy link

Yes they need to be cumulative. You can create a complete summary with NewConstSummary, but you will need the aggregated sum and count for that. The way I am used to doing that is to hold the values in a separate (map of) struct(s), then implement the custom collector interface to only construct the actual metric objects at scrape time

How does the window relate to the reporting interval? If we get the count for the window every time at the end of the window, summing them all up would be "good enough". If somehow an increment gets lost, nothing we can do about it. OTOH if the exporter restarts and the count starts from zero again, that's OK as long as it doesn't happen too frequently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

error handling stats line - after update to rsyslog-8.2204
2 participants