Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating "Best practices: Naming lables" #2469 #2549

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Gopi-eng2202
Copy link

Hi @SuperQ and @juliusv,
cc @beorn7 ,
This pr is about the issue #8718.
I have updated the documentation on Best practices: Metric naming, based on the discussion from the mentioned issue#. I request you to have a look at it and let me know what you think.

Signed-off-by: Gopi-eng2202 [email protected]

* If the metric name has a unit, then the unit must be the last word before the accumulating count. Examples,
* <code>process\_cpu\_<b>seconds</b>\_<b>total</b></code>
* <code>request\_size\_<b>bytes</b>\_<b>total</b></code>
* If the metric name does'nt have a real unit, then the naming can follow any sorting order in a way that it fits your use case, such as `grouping similar metric names`. Examples,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I agree with this section. The example connections is a real countable unit.

However, we have agreed in the past that unit modifiers are prefered to be before, but are still allowed to be after. For example, both process_thing_failed_bytes_total and process_thing_bytes_failed_total are both valid.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, i can try to rephrase that.

But with regards to your overall comment, I understood that only standalone units which are measurable, like "seconds", "bytes" are considered to be a real unit. In a different context "connections" could be a unit, but it does'nt represent a unit of measurement though.

Correct me if i am wrong, ill read the discussion for this issue again to see if i can rephrase these lines.

Thanks for the feedback, will get back with this issue soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood the previous discussions such that we had reached the consensus that we won't consider things like connections a "real" unit for the sake of metric naming. (I'm not sure, though, if we should use the wording "real unit" in the doc here.) If my understanding is flawed, maybe @juliusv and @SuperQ could go back to the drawing board and find out what the consensus is.

Very generally, this document is just a recommendation anyway, so we aren't defining a spec here and maybe don't have to be super precise.

On the other hand, OpenMetrics in its current state specifies about the UNIT metadata field:

Unit specifies MetricFamily units. If non-empty, it MUST be a suffix of the MetricFamily name separated by an underscore. Be aware that further generation rules might make it an infix in the text format.

I'm not sure if we are ever going to enforce this in Prometheus (and the strict requirement might even be lifted in OM v2), but since the OM standard was evolved from existing best practices, it gives us some hints. I would believe the following to be true:

  • A "real" unit is one that would show up in the UNIT metadata field. (E.g. we would probably not have a line like # UNIT go_sql_stats_connections_idle connections in the OM exposition.)
  • Such a "real" unit really should be the last thing before any other suffix like _total (in OM, it's a MUST even).

So maybe that's what should go in here: Somehow explain what "real" units are (using a better wording than just saying "real unit"), and then have a very strong recommendation to put them last.

Co-authored-by: Ben Kochie <[email protected]>
Signed-off-by: gopi <[email protected]>
Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To take a step back:

The changes as proposed right now are in part trying to recommend even more strictly that the unit has to go last. But that's already in the guidelines, even without a change. Remember that the whole discussion started because people felt the rule should be expressed in a less strict way, because they saw "units" like "connections" not being last in widely used metric names. The discussion then turned more the other way by concluding that things like "connections" are not "real" units.

To summarize:

I think we need far fewer changes here. What I want to see is a good rule-of-thumb what a unit is in the narrow sense (maybe referring to the "base units" section further down, which essentially lists many relevant examples, notably excluding things like "connections") and then we don't really have to change much else. Maybe we could add a new paragraph about ordering the metric name components in a way that the lexicographic sorting makes most sense, and that then you should still put the "real" units last, but say that this doesn't apply to things like "connections".

@@ -39,6 +39,26 @@ A metric name...
(for a pseudo-metric that provides [metadata](https://www.robustperception.io/exposing-the-software-version-to-prometheus) about the running binary)
* <code>data\_pipeline\_last\_record\_processed\_<b>timestamp_seconds</b></code>
(for a timestamp that tracks the time of the latest record processed in a data processing pipeline)
* ...that has a `unit` in its name, should have that `unit` as the last word. Note that an accumulating count such as `total` will be the last word, in addition to the unit if applicable. Examples,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This partially repeats the paragraph above. Maybe we can just work the new information into the previous paragraph? (Note that the previous paragraph gets the indentation wrong. Maybe something we can fix now. I.e. all the example metric name should be indented to be in a sub-list.)

* If the metric name has a unit, then the unit must be the last word before the accumulating count. Examples,
* <code>process\_cpu\_<b>seconds</b>\_<b>total</b></code>
* <code>request\_size\_<b>bytes</b>\_<b>total</b></code>
* If the metric name does'nt have a real unit, then the naming can follow any sorting order in a way that it fits your use case, such as `grouping similar metric names`. Examples,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: does'nt → doesn't

Comment on lines 58 to 61
* <code>net\_conntrack\_dialer\_conn\_total</code>
* <code>net\_conntrack\_dialer\_conn\_total\_failed</code>
* <code>net\_conntrack\_dialer\_conn\_total\_established</code>
* <code>net\_conntrack\_dialer\_conn\_total\_closed</code>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that we always want the _total last in a counter. So these examples should not be here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'll work on this

Comment on lines 57 to 61
or
* <code>net\_conntrack\_dialer\_conn\_total</code>
* <code>net\_conntrack\_dialer\_conn\_total\_failed</code>
* <code>net\_conntrack\_dialer\_conn\_total\_established</code>
* <code>net\_conntrack\_dialer\_conn\_total\_closed</code>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this example is invalid.

Suggested change
or
* <code>net\_conntrack\_dialer\_conn\_total</code>
* <code>net\_conntrack\_dialer\_conn\_total\_failed</code>
* <code>net\_conntrack\_dialer\_conn\_total\_established</code>
* <code>net\_conntrack\_dialer\_conn\_total\_closed</code>

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it

Copy link
Author

@Gopi-eng2202 Gopi-eng2202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @beorn7 & @SuperQ,
I have taken account of all the feedbacks and made changes accordingly.
Please have a look and let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants