Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question arising from analysis of collector/utils.go #124

Closed
KacperPerschke opened this issue Jan 9, 2024 · 10 comments
Closed

Question arising from analysis of collector/utils.go #124

KacperPerschke opened this issue Jan 9, 2024 · 10 comments

Comments

@KacperPerschke
Copy link
Contributor

The real question is at the end.

Introduction

Attempts to analyze the bytesConverter and removeCommas functions contained in the collector/utils.go file led me to the conclusion that:

  1. The artifactory string containing the number is fragmented.
  2. The first fragment is treated as a number.
    • Remember that slices are indexed from 0.
  3. The second fragment is interpreted as a multiplier.
    • There are some minor inconsistencies.
  4. The remaining fragments are ignored.

My desire

I would like the code to show what it does a little more clearly, and the comments to possibly explain why such decisions were made.

Data

During testing for #121, we collected some responses from artifactory and managed to group them into the cases below.

  • Pure number with comma as thousands separator.
    • [0-9],[0-9][0-9][0-9],[0-9][0-9][0-9]
  • An integer, or with a decimal fraction and a multiplier not separated by a space.
    • [0-9]GB
    • [0-9][0-9]GB
    • [0-9][0-9][0-9]GB
    • [0-9]TB
    • [0-9]%
    • [0-9].[0-9]%
    • [0-9].[0-9][0-9]%
    • [0-9][0-9].[0-9]%
    • [0-9][0-9].[0-9][0-9]%
  • Number without or with a decimal. Multiplier separated by a space.
    • [0-9] bytes
    • [0-9][0-9] bytes
    • [0-9].[0-9][0-9] KB
    • [0-9].[0-9][0-9] MB
    • [0-9].[0-9][0-9] GB
    • [0-9].[0-9][0-9] TB
    • [0-9][0-9].[0-9][0-9] KB
    • [0-9][0-9].[0-9][0-9] MB
    • [0-9][0-9].[0-9][0-9] GB
    • [0-9][0-9].[0-9][0-9] TB
    • [0-9][0-9][0-9].[0-9][0-9] KB
    • [0-9][0-9][0-9].[0-9][0-9] MB
    • [0-9][0-9][0-9].[0-9][0-9] GB
  • A flower in a buttonhole.
    • [0-9].[0-9][0-9] TB ([0-9].[0-9]%)
    • [0-9].[0-9][0-9] TB ([0-9].[0-9][0-9]%)
    • [0-9][0-9].[0-9][0-9] TB ([0-9][0-9].[0-9]%)
    • [0-9][0-9].[0-9][0-9] TB ([0-9][0-9].[0-9][0-9]%)

Proposal of a new algorithm.

  1. We use regexp to recognize the case and split it into elements.
  2. If we encounter a case not covered by regexp, we return an error.
  3. If the multiplier is %, we convert number to the range 0–1.
  4. If we have the last case, we return both numbers.

The Question

Do we stay with the current algorithm or switch to a new one?

Copy link

github-actions bot commented Feb 9, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Feb 9, 2024
@peimanja peimanja removed the Stale label Feb 9, 2024
@KacperPerschke
Copy link
Contributor Author

This is a green light request. I'll be happy to implement it.

But I'm not sure whether this idea fits the program author's concept. Hence I prefer to ask.

@peimanja
Copy link
Owner

@KacperPerschke Thanks for the observation and putting this together. I would be open to switch to the new model. I might have time this weekend to look at some of the overdue issues and PRs but yeah feel free to raise a PR to switch to the new model

@KacperPerschke
Copy link
Contributor Author

KacperPerschke commented Feb 18, 2024

A proposition that I haven't yet integrated with the rest of the program. That's why I don't even push to my fork repo.

package collector

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

const (
	mulPercent = 0.01
	mulKB      = 1024
	mulMB      = mulKB * 1024
	mulGB      = mulMB * 1024
	mulTB      = mulGB * 1024
)

// For Data Driven Development see https://github.com/peimanja/artifactory_exporter/issues/124
const (
	pattNumThouComma  = `^(?P<number>[[:digit:]]{1,3}(?:,[[:digit:]]{3})*)$`
	pattNumMult       = `^(?P<number>[[:digit:]]+(?:\.[[:digit:]]{1,2})?)(?P<multiplier>%|[GT]B)$`
	pattNumSpaceMult  = `^(?P<number>[[:digit:]]+(?:\.[[:digit:]]{1,2})?) (?P<multiplier>bytes|[KMGT]B)$`
	pattTBytesPercent = `^(?P<tbytes>[[:digit:]]+(?:\.[[:digit:]]{1,2})?) TB \((?P<percent>[[:digit:]]{1,2}(?:\.[[:digit:]]{1,2})?)%\)$`
)

var (
	reNumThouComma  = regexp.MustCompile(pattNumThouComma)
	reNumMult       = regexp.MustCompile(pattNumMult)
	reNumSpaceMult  = regexp.MustCompile(pattNumSpaceMult)
	reTBytesPercent = regexp.MustCompile(pattTBytesPercent)
)

type convResult []float64

func (e *Exporter) convArtiNumberToEpoch(artiNum string) (convResult, error) {
	emptyResult := convResult{}
	e.logger.Debug(
		"Attempting to convert a string from artifactory representing a number.",
		"artifactory.number", artiNum,
	)
	switch {
	case reNumThouComma.MatchString(artiNum):
		res := extractNamedGroups(artiNum, reNumThouComma)
		f, err := e.convNumber(
			strings.Replace(res["number"], `,`, ``, -1),
		)
		if err != nil {
			return emptyResult, err
		}
		return convResult{f}, nil
	case reNumMult.MatchString(artiNum):
		res := extractNamedGroups(artiNum, reNumMult)
		f, err := e.convNumber(res["number"])
		if err != nil {
			return emptyResult, err
		}
		m, err := e.convMultiplier(res["multiplier"])
		if err != nil {
			return emptyResult, err
		}
		return convResult{f * m}, nil
	case reNumSpaceMult.MatchString(artiNum):
		res := extractNamedGroups(artiNum, reNumSpaceMult)
		f, err := e.convNumber(res["number"])
		if err != nil {
			return emptyResult, err
		}
		m, err := e.convMultiplier(res["multiplier"])
		if err != nil {
			return emptyResult, err
		}
		return convResult{f * m}, nil
	case reTBytesPercent.MatchString(artiNum):
		res := extractNamedGroups(artiNum, reTBytesPercent)
		b, err := e.convNumber(res["tbytes"])
		if err != nil {
			return emptyResult, err
		}
		p, err := e.convNumber(res["percent"])
		if err != nil {
			return emptyResult, err
		}
		return convResult{b * mulTB, p * mulPercent}, nil
	default:
		e.logger.Debug(
			"The arti number did not match known templates.",
			"artifactory.number", artiNum,
		)
		err := fmt.Errorf(`The string '%s' does not match any known patterns.`, artiNum)
		return emptyResult, err
	}
}

func (e *Exporter) convMultiplier(m string) (float64, error) {
	switch {
	case m == `%`:
		return mulPercent, nil
	case m == `bytes`:
		return 1, nil
	case m == `KB`:
		return mulKB, nil
	case m == `MB`:
		return mulMB, nil
	case m == `GB`:
		return mulGB, nil
	case m == `TB`:
		return mulTB, nil
	default:
		e.logger.Error(
			"The string was not recognized as a known multiplier.",
			"artifactory.number.multiplier", m,
		)
		return 0, fmt.Errorf(`Could not recognise '%s; as multiplier`, m)
	}
}

func (e *Exporter) convNumber(n string) (float64, error) {
	f, err := strconv.ParseFloat(n, 64)
	if err != nil {
		e.logger.Error(
			"String not convertible to float64",
			"string", n,
		)
		return 0, err
	}
	return f, nil
}

type capturedGroups map[string]string

func extractNamedGroups(artiNum string, re *regexp.Regexp) capturedGroups {
	match := re.FindStringSubmatch(artiNum)
	list := make(capturedGroups)
	for i, name := range re.SubexpNames() {
		if i != 0 && name != "" {
			list[name] = match[i]
		}
	}
	return list
}

Please comment, does this lead the next developer by the hand?

KacperPerschke added a commit to KacperPerschke/artifactory_exporter that referenced this issue Feb 24, 2024
An attempt to clean up the code
that converts Artifactory strings
to numbers accepted in Prometheus.
KacperPerschke added a commit to KacperPerschke/artifactory_exporter that referenced this issue Feb 24, 2024
An attempt to clean up the code
that converts Artifactory strings
to numbers accepted in Prometheus.
KacperPerschke added a commit to KacperPerschke/artifactory_exporter that referenced this issue Feb 24, 2024
An attempt to clean up the code
that converts Artifactory strings
to numbers accepted in Prometheus.
@KacperPerschke
Copy link
Contributor Author

Malice towards Java programmers — experience with artifactory and keycloak leads me to believe that inconsistency is their sacred duty.

Now in to meritum. Collecting further cases brought me to a conclusion. The ways in which the artifactory api gives numbers can be divided into two groups.

  • One number with or without a multiplier.
    [0-9] bytes
    [0-9]%
    [0-9],[0-9][0-9][0-9],[0-9][0-9][0-9]
    [0-9],[0-9][0-9][0-9].[0-9][0-9] MB
    [0-9].[0-9]%
    [0-9].[0-9][0-9] GB
    [0-9].[0-9][0-9] KB
    [0-9].[0-9][0-9] MB
    [0-9].[0-9][0-9] TB
    [0-9].[0-9][0-9]%
    [0-9]GB
    [0-9]TB
    [0-9][0-9] bytes
    [0-9][0-9].[0-9]%
    [0-9][0-9].[0-9][0-9] GB
    [0-9][0-9].[0-9][0-9] KB
    [0-9][0-9].[0-9][0-9] MB
    [0-9][0-9].[0-9][0-9] TB
    [0-9][0-9].[0-9][0-9]%
    [0-9][0-9]GB
    [0-9][0-9][0-9].[0-9][0-9] GB
    [0-9][0-9][0-9].[0-9][0-9] KB
    [0-9][0-9][0-9].[0-9][0-9] MB
    [0-9][0-9][0-9]GB
  • Two numbers formatted in a specific way.
    [0-9].[0-9][0-9] TB ([0-9].[0-9]%)
    [0-9].[0-9][0-9] TB ([0-9].[0-9][0-9]%)
    [0-9][0-9].[0-9][0-9] TB ([0-9][0-9].[0-9]%)
    [0-9][0-9].[0-9][0-9] TB ([0-9][0-9].[0-9][0-9]%)

#135 follows.

peimanja pushed a commit that referenced this issue Mar 1, 2024
* Issue #124.

An attempt to clean up the code
that converts Artifactory strings
to numbers accepted in Prometheus.

* Unit tests plus sonarcloud suggestion.

* Names unification.
@KacperPerschke
Copy link
Contributor Author

@peimanja thank you for approving #135, but we should be aware of the effect of this code change.

I managed to check in Prometheus that the values representing percentages have changed from [0, 100] to [0.00, 1.00]. Please take into account this information.

I don't know Grafana well enough to suggest a change in the configuration of the dashboard supplied with the program. For this reason, I neglected to make a necessary correction. I'm sorry.

@peimanja
Copy link
Owner

peimanja commented Mar 1, 2024

no problem, we definitely want to avoid introducing a breaking change like that. So we should probably either revert that before the next release or fix it if possible.

Unfortunately I've been quite busy and don't have much time other than some weekends to take a look

@KacperPerschke
Copy link
Contributor Author

I've changed grafana dashboard at my workplace.
It was necessary to change "unit": "percent" to "unit": "percentunit".
But I can't find "unit": "percent" in dashboard.json file.

@peimanja
Copy link
Owner

is it this ?

@KacperPerschke
Copy link
Contributor Author

is it this ?

I'm sorry, but my knowledge about grafana is too limited to be able to answer this question clearly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants