Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleanup(sinsp/metrics): assorted cleanups for robustness #1950

Merged
merged 3 commits into from
Aug 20, 2024

Conversation

incertum
Copy link
Contributor

@incertum incertum commented Jul 5, 2024

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-savefile

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

/version driver-API-version-major

/version driver-API-version-minor

/version driver-API-version-patch

/version driver-SCHEMA-version-major

/version driver-SCHEMA-version-minor

/version driver-SCHEMA-version-patch

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@incertum
Copy link
Contributor Author

incertum commented Jul 5, 2024

CC @Issif @sgaist @FedeDP

@incertum
Copy link
Contributor Author

incertum commented Jul 5, 2024

/milestone 0.18.0

Copy link

github-actions bot commented Jul 5, 2024

Perf diff from master - unit tests

    11.92%     -1.18%  [.] sinsp_parser::reset
     1.00%     +0.74%  [.] sinsp::fetch_next_event
     7.16%     +0.74%  [.] sinsp::next
     4.64%     -0.69%  [.] next
     4.10%     -0.67%  [.] gzfile_read
     3.90%     -0.64%  [.] sinsp_evt::load_params
     2.00%     -0.63%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     0.74%     +0.60%  [.] sinsp_parser::event_cleanup
     2.15%     +0.57%  [.] scap_event_decode_params
     0.70%     +0.55%  [.] sinsp_parser::parse_context_switch

Perf diff from master - scap file

    10.08%     -3.82%  [.] sinsp_filter_check::tostring
     3.36%     +3.63%  [.] sinsp_filter_check_event::extract_single
     3.34%     +3.20%  [.] sinsp_parser::process_event
     3.21%     +2.51%  [.] sinsp_filter_check::get_field_info
     6.65%     -2.01%  [.] sinsp_utils::ts_to_string
     3.27%     +1.81%  [.] sinsp_filter_check::apply_transformers
     3.31%     +1.38%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     6.72%     -0.73%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
    13.36%     -0.53%  [.] sinsp_filter_check::extract_nocache
    13.40%     +0.49%  [.] sinsp_evt_formatter::tostring_withformat

Heap diff from master - unit tests

total runtime: 0.10s.
calls to allocation functions: -5935 (-60561/s)
temporary memory allocations: -465 (-4744/s)
peak heap memory consumption: -2.43K
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

total runtime: 0.01s.
calls to allocation functions: 0 (0/s)
temporary memory allocations: -3 (-214/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Copy link
Contributor

@sgaist sgaist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good cleanup !

Comment on lines 100 to 107
if (!sanitized_name.empty() && !(std::isalnum(sanitized_name.front()) || sanitized_name.front() == '_'))
{
sanitized_name = "_" + sanitized_name;
}
if (!sanitized_name.empty() && sanitized_name.back() == '_')
{
sanitized_name.pop_back();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure to follow the logic:
if not empty, an underscore is prepended if the first char is not alpha numeric or if it is already an underscore.
The former should not happen based on the RegEx and for the latter, the documentation state that double underscore starting names are reserved for internal use.

Also, if the metric name is empty, then nothing happen, shouldn't it scream about it ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misread the regex and thought for the metric name also numbers are allowed in front. Plus yes the !sanitized_name.empty() was an error here. Re-pushed to 100% strictly stay with the prometheus mandates.

Right now we would replace and empty metrics name with _. We chatted a bit over slack and are unsure how to best handle it. Logs could be too noisy and since metrics are not that critical, we certainly don't want to error out or anything. Maybe simply not adding the metric is the right solution.

Comment on lines 93 to 126
std::string prometheus_sanitize_metric_name(const std::string& name)
{
// https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
static const RE2 invalid_chars("[^a-zA-Z0-9_]");
std::string sanitized_name = name;
RE2::GlobalReplace(&sanitized_name, invalid_chars, "_");
RE2::GlobalReplace(&sanitized_name, "_+", "_");
if (!sanitized_name.empty() && !(std::isalnum(sanitized_name.front()) || sanitized_name.front() == '_'))
{
sanitized_name = "_" + sanitized_name;
}
if (!sanitized_name.empty() && sanitized_name.back() == '_')
{
sanitized_name.pop_back();
}
return sanitized_name;
}

std::string prometheus_sanitize_label_name(const std::string& name)
{
// https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
static const RE2 invalid_chars("[^a-zA-Z0-9_]");
std::string sanitized_label = name;
RE2::GlobalReplace(&sanitized_label, invalid_chars, "_");
RE2::GlobalReplace(&sanitized_label, "_+", "_");

// Ensure the label starts with a letter or underscore (if empty after sanitizing, set to "_")
if (sanitized_label.empty() || (!std::isalpha(sanitized_label.front()) && sanitized_label.front() != '_'))
{
sanitized_label = "_" + sanitized_label;
}

return sanitized_label;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to the logic question, I would factor out the common part to avoid code duplication since both are using exactly the same cleanup starting phase.

Suggested change
std::string prometheus_sanitize_metric_name(const std::string& name)
{
// https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
static const RE2 invalid_chars("[^a-zA-Z0-9_]");
std::string sanitized_name = name;
RE2::GlobalReplace(&sanitized_name, invalid_chars, "_");
RE2::GlobalReplace(&sanitized_name, "_+", "_");
if (!sanitized_name.empty() && !(std::isalnum(sanitized_name.front()) || sanitized_name.front() == '_'))
{
sanitized_name = "_" + sanitized_name;
}
if (!sanitized_name.empty() && sanitized_name.back() == '_')
{
sanitized_name.pop_back();
}
return sanitized_name;
}
std::string prometheus_sanitize_label_name(const std::string& name)
{
// https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
static const RE2 invalid_chars("[^a-zA-Z0-9_]");
std::string sanitized_label = name;
RE2::GlobalReplace(&sanitized_label, invalid_chars, "_");
RE2::GlobalReplace(&sanitized_label, "_+", "_");
// Ensure the label starts with a letter or underscore (if empty after sanitizing, set to "_")
if (sanitized_label.empty() || (!std::isalpha(sanitized_label.front()) && sanitized_label.front() != '_'))
{
sanitized_label = "_" + sanitized_label;
}
return sanitized_label;
}
std::string _sanitize_name(const std::string& name)
{
// https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
static const RE2 invalid_chars("[^a-zA-Z0-9_]");
std::string sanitized_name = name;
// RE2::Extract(name, invalid_chars, "_", &sanitized_name);
RE2::GlobalReplace(&sanitized_name, invalid_chars, "_");
RE2::GlobalReplace(&sanitized_name, "_+", "_");
return sanitized_name;
}
std::string prometheus_sanitize_metric_name(const std::string& name)
{
std::string sanitized_name = _sanitize_name(name);
if (!sanitized_name.empty())
{
if (!(std::isalnum(sanitized_name.front()) || sanitized_name.front() == '_'))
{
sanitized_name.insert(0, "_");
}
if (sanitized_name.back() == '_')
{
sanitized_name.pop_back();
}
}
return sanitized_name;
}
std::string prometheus_sanitize_label_name(const std::string& name)
{
std::string sanitized_label = _sanitize_name(name);
// Ensure the label starts with a letter or underscore (if empty after sanitizing, set to "_")
if (sanitized_label.empty() || (!std::isalpha(sanitized_label.front()) && sanitized_label.front() != '_'))
{
sanitized_label.insert(0, "_");
}
return sanitized_label;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an alternative solution as second co-authored commit, let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sgaist kind bump, ty!

{
prometheus_text += "," + key + "=\"" + value + "\"" ;
prometheus_text += " "; // white space at the end important!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
prometheus_text += " "; // white space at the end important!
prometheus_text += " "; // the white space at the end is important!

}
prometheus_text += prometheus_sanitize_label_name(key) + "=\"" + value + "\"";
}
prometheus_text += "} "; // white space at the end important!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
prometheus_text += "} "; // white space at the end important!
prometheus_text += "} "; // the white space at the end is important!

@poiana
Copy link
Contributor

poiana commented Jul 8, 2024

@sgaist: changing LGTM is restricted to collaborators

In response to this:

That's a good cleanup !

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

incertum and others added 2 commits July 8, 2024 21:22
Copy link

github-actions bot commented Jul 8, 2024

Perf diff from master - unit tests

    11.93%     -2.57%  [.] sinsp_parser::reset
     4.20%     +1.28%  [.] sinsp_parser::process_event
     4.81%     +1.20%  [.] sinsp_thread_manager::find_thread
     4.64%     -1.20%  [.] next
     7.16%     -0.75%  [.] sinsp::next
     0.85%     -0.61%  [.] scap_next
     1.00%     +0.51%  [.] sinsp::fetch_next_event
     1.25%     -0.49%  [.] sinsp_fdtable::find_ref
     0.74%     +0.47%  [.] sinsp_parser::event_cleanup
     1.06%     -0.46%  [.] sinsp_threadinfo::~sinsp_threadinfo

Perf diff from master - scap file

    13.88%     -3.24%  [.] sinsp_evt_formatter::tostring_withformat
    13.84%     -3.23%  [.] sinsp_filter_check::extract_nocache
     3.43%     +2.48%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     6.97%     +2.40%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     3.49%     +2.29%  [.] sinsp_filter_check_event::extract_single
     3.26%     +2.14%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>
     3.48%     +2.08%  [.] libsinsp::runc::match_container_id
     3.48%     +2.00%  [.] 0x00000000000a7310
     6.95%     -1.62%  [.] sinsp::next
     6.89%     -1.61%  [.] sinsp_parser::reset

Heap diff from master - unit tests

total runtime: 0.02s.
calls to allocation functions: -10684 (-534200/s)
temporary memory allocations: -729 (-36450/s)
peak heap memory consumption: 1.40K
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

total runtime: 0.02s.
calls to allocation functions: 0 (0/s)
temporary memory allocations: -3 (-130/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

@poiana
Copy link
Contributor

poiana commented Aug 8, 2024

LGTM label has been added.

Git tree hash: 6d68da1fdf4ccef12e2abfe2ead351d23db0bb43

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

Copy link
Member

@Andreagit97 Andreagit97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Aug 20, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, FedeDP, incertum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Andreagit97,FedeDP,incertum]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit 1dc7459 into falcosecurity:master Aug 20, 2024
42 of 43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants