-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating "Best practices: Naming lables" #2469 #2549
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Gopi-eng2202 <[email protected]>
Signed-off-by: Gopi-eng2202 <[email protected]>
content/docs/practices/naming.md
Outdated
* If the metric name has a unit, then the unit must be the last word before the accumulating count. Examples, | ||
* <code>process\_cpu\_<b>seconds</b>\_<b>total</b></code> | ||
* <code>request\_size\_<b>bytes</b>\_<b>total</b></code> | ||
* If the metric name does'nt have a real unit, then the naming can follow any sorting order in a way that it fits your use case, such as `grouping similar metric names`. Examples, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree with this section. The example connections
is a real countable unit.
However, we have agreed in the past that unit modifiers are prefered to be before, but are still allowed to be after. For example, both process_thing_failed_bytes_total
and process_thing_bytes_failed_total
are both valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand, i can try to rephrase that.
But with regards to your overall comment, I understood that only standalone units which are measurable, like "seconds", "bytes" are considered to be a real unit. In a different context "connections" could be a unit, but it does'nt represent a unit of measurement though.
Correct me if i am wrong, ill read the discussion for this issue again to see if i can rephrase these lines.
Thanks for the feedback, will get back with this issue soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understood the previous discussions such that we had reached the consensus that we won't consider things like connections
a "real" unit for the sake of metric naming. (I'm not sure, though, if we should use the wording "real unit" in the doc here.) If my understanding is flawed, maybe @juliusv and @SuperQ could go back to the drawing board and find out what the consensus is.
Very generally, this document is just a recommendation anyway, so we aren't defining a spec here and maybe don't have to be super precise.
On the other hand, OpenMetrics in its current state specifies about the UNIT metadata field:
Unit specifies MetricFamily units. If non-empty, it MUST be a suffix of the MetricFamily name separated by an underscore. Be aware that further generation rules might make it an infix in the text format.
I'm not sure if we are ever going to enforce this in Prometheus (and the strict requirement might even be lifted in OM v2), but since the OM standard was evolved from existing best practices, it gives us some hints. I would believe the following to be true:
- A "real" unit is one that would show up in the UNIT metadata field. (E.g. we would probably not have a line like
# UNIT go_sql_stats_connections_idle connections
in the OM exposition.) - Such a "real" unit really should be the last thing before any other suffix like
_total
(in OM, it's a MUST even).
So maybe that's what should go in here: Somehow explain what "real" units are (using a better wording than just saying "real unit"), and then have a very strong recommendation to put them last.
Co-authored-by: Ben Kochie <[email protected]> Signed-off-by: gopi <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To take a step back:
The changes as proposed right now are in part trying to recommend even more strictly that the unit has to go last. But that's already in the guidelines, even without a change. Remember that the whole discussion started because people felt the rule should be expressed in a less strict way, because they saw "units" like "connections" not being last in widely used metric names. The discussion then turned more the other way by concluding that things like "connections" are not "real" units.
To summarize:
I think we need far fewer changes here. What I want to see is a good rule-of-thumb what a unit is in the narrow sense (maybe referring to the "base units" section further down, which essentially lists many relevant examples, notably excluding things like "connections") and then we don't really have to change much else. Maybe we could add a new paragraph about ordering the metric name components in a way that the lexicographic sorting makes most sense, and that then you should still put the "real" units last, but say that this doesn't apply to things like "connections".
content/docs/practices/naming.md
Outdated
@@ -39,6 +39,26 @@ A metric name... | |||
(for a pseudo-metric that provides [metadata](https://www.robustperception.io/exposing-the-software-version-to-prometheus) about the running binary) | |||
* <code>data\_pipeline\_last\_record\_processed\_<b>timestamp_seconds</b></code> | |||
(for a timestamp that tracks the time of the latest record processed in a data processing pipeline) | |||
* ...that has a `unit` in its name, should have that `unit` as the last word. Note that an accumulating count such as `total` will be the last word, in addition to the unit if applicable. Examples, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This partially repeats the paragraph above. Maybe we can just work the new information into the previous paragraph? (Note that the previous paragraph gets the indentation wrong. Maybe something we can fix now. I.e. all the example metric name should be indented to be in a sub-list.)
content/docs/practices/naming.md
Outdated
* If the metric name has a unit, then the unit must be the last word before the accumulating count. Examples, | ||
* <code>process\_cpu\_<b>seconds</b>\_<b>total</b></code> | ||
* <code>request\_size\_<b>bytes</b>\_<b>total</b></code> | ||
* If the metric name does'nt have a real unit, then the naming can follow any sorting order in a way that it fits your use case, such as `grouping similar metric names`. Examples, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: does'nt → doesn't
content/docs/practices/naming.md
Outdated
* <code>net\_conntrack\_dialer\_conn\_total</code> | ||
* <code>net\_conntrack\_dialer\_conn\_total\_failed</code> | ||
* <code>net\_conntrack\_dialer\_conn\_total\_established</code> | ||
* <code>net\_conntrack\_dialer\_conn\_total\_closed</code> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure that we always want the _total
last in a counter. So these examples should not be here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'll work on this
content/docs/practices/naming.md
Outdated
or | ||
* <code>net\_conntrack\_dialer\_conn\_total</code> | ||
* <code>net\_conntrack\_dialer\_conn\_total\_failed</code> | ||
* <code>net\_conntrack\_dialer\_conn\_total\_established</code> | ||
* <code>net\_conntrack\_dialer\_conn\_total\_closed</code> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this example is invalid.
or | |
* <code>net\_conntrack\_dialer\_conn\_total</code> | |
* <code>net\_conntrack\_dialer\_conn\_total\_failed</code> | |
* <code>net\_conntrack\_dialer\_conn\_total\_established</code> | |
* <code>net\_conntrack\_dialer\_conn\_total\_closed</code> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
Signed-off-by: Gopi-eng2202 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SuperQ and @juliusv,
cc @beorn7 ,
This pr is about the issue #8718.
I have updated the documentation on Best practices: Metric naming, based on the discussion from the mentioned issue#. I request you to have a look at it and let me know what you think.
Signed-off-by: Gopi-eng2202 [email protected]