feat(webserver): implement metrics endpoint #3140

Merged
merged 22 commits into from
May 3, 2024
Changes from 21 commits
aa4ae6f
feat(webserver): implement metrics endpoint
sgaist Mar 15, 2024
d527083
fix(webserver): correct enabled check
sgaist Mar 19, 2024
1bb9b4e
fix(webserver): remove extra line return
sgaist Mar 19, 2024
e0f9c92
fix(webserver): use falcosecurity as metric namespace
sgaist Mar 19, 2024
d6e980e
refactor(webserver): move metrics endpoint activation under webserver
sgaist Mar 22, 2024
0cf84c8
refactor(configuration): move webserver items in own struct
sgaist Mar 24, 2024
beda9dd
refactor(metrics): move metrics handling to its own class
sgaist Mar 26, 2024
9d3e602
fix(metrics): correct metrics namespace
sgaist Mar 29, 2024
d575380
fix(metrics): correct static metrics
sgaist Mar 29, 2024
c940ceb
fix(metrics): correct hostname metrics name and namespace
sgaist Apr 4, 2024
71e1783
doc(falco_metrics): add basic documentation
sgaist Apr 10, 2024
459f4bd
refactor(falco_metrics): put content type in documented constant
sgaist Apr 10, 2024
aa0e813
refactor(metrics): make to_text get the application state
sgaist Apr 16, 2024
34436d6
refactor(metrics): use prometheus_metrics_enabled for configuration
sgaist Apr 16, 2024
6746e3b
feat(falco_metrics): add outputs_queue_num_drops
sgaist Apr 21, 2024
2c2db8d
feat(falco_metrics): add duration_sec
sgaist Apr 21, 2024
2a60be3
feat(falco_metrics): add event sources
sgaist Apr 21, 2024
494ecc2
fix(falco_metrics): remove redundant falco in version metrics
sgaist Apr 24, 2024
934df4c
fix(falco_metrics): make duration_sec a count and not a timestamp
sgaist Apr 24, 2024
b587754
chore(configuration): add reference to Prometheus endpoint in metrics…
sgaist Apr 24, 2024
815af40
fix(falco_metrics): make duration_sec and outputs_queue_num_drops mon…
sgaist Apr 24, 2024
77a6a71
fix(falco_metrics): remove falco_ prefix for version
sgaist Apr 29, 2024
7 changes: 6 additions & 1 deletion falco.yaml
@@ -695,6 +695,9 @@ webserver:
   # Can be an IPV4 or IPV6 address, defaults to IPV4
   listen_address: 0.0.0.0
   k8s_healthz_endpoint: /healthz
+  # Enable the metrics endpoint providing Prometheus values
+  # It will only have an effect if metrics.enabled is set to true as well.
+  prometheus_metrics_enabled: false
Contributor

@leogr do we need to enforce a feature status for this, such as Incubating?

Member

It would be preferable, IMO. It is not a blocker for this PR anyway. We may reaudit all options later, but still before the 0.38 release. Does it make sense?

cc @falcosecurity/falco-maintainers

Contributor

@leogr I propose merging it as is and addressing the follow-up items in a new PR. I believe we need another touch-up PR anyway.

   ssl_enabled: false
   ssl_certificate: /etc/falco/falco.pem

@@ -967,7 +970,9 @@ syscall_event_drops:
 # beneficial for exploring the data schema and ensuring that fields with empty
 # values are included in the output.
 #
-# todo: prometheus export option
+# If metrics are enabled, the web server can be configured to activate the
+# corresponding Prometheus endpoint using webserver.prometheus_metrics_enabled.
+#
 # todo: syscall_counters_enabled option
 metrics:
   enabled: false
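The two comments added to falco.yaml describe a layered switch: the Prometheus endpoint only matters when the web server is running, metrics are enabled, and the new flag is set. A minimal sketch of that reading follows. The struct and function names here are hypothetical stand-ins for illustration, not the types used in this PR.

```cpp
#include <string>

// Hypothetical stand-in for the `metrics:` section of falco.yaml.
struct metrics_settings
{
	bool enabled = false;
};

// Hypothetical stand-in for the `webserver:` section of falco.yaml.
struct webserver_settings
{
	bool enabled = false;
	bool prometheus_metrics_enabled = false;
};

// Assumption: the endpoint is useful only when all three switches are on,
// which is how the configuration comments read.
bool prometheus_endpoint_active(const metrics_settings& metrics,
				const webserver_settings& webserver)
{
	return webserver.enabled
		&& metrics.enabled
		&& webserver.prometheus_metrics_enabled;
}
```

With the defaults shown in the diff (both flags `false`), the function returns `false`, so existing deployments would be unaffected until both settings are turned on.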
2 changes: 1 addition & 1 deletion unit_tests/falco/test_configuration.cpp
@@ -323,7 +323,7 @@ TEST(Configuration, configuration_webserver_ip)

 		EXPECT_NO_THROW(falco_config.init(cmdline_config_options));

-		ASSERT_EQ(falco_config.m_webserver_listen_address, address);
+		ASSERT_EQ(falco_config.m_webserver_config.m_listen_address, address);
 	}

 	std::vector<std::string> invalid_addresses = {"327.0.0.1",
1 change: 1 addition & 0 deletions userspace/falco/CMakeLists.txt
@@ -92,6 +92,7 @@ if(CMAKE_SYSTEM_NAME MATCHES "Linux" AND NOT MINIMAL_BUILD)
     PRIVATE
         outputs_grpc.cpp
         outputs_http.cpp
+        falco_metrics.cpp
         webserver.cpp
         grpc_context.cpp
         grpc_server_impl.cpp
37 changes: 16 additions & 21 deletions userspace/falco/app/actions/start_webserver.cpp
@@ -24,53 +24,48 @@ limitations under the License.
 using namespace falco::app;
 using namespace falco::app::actions;

-falco::app::run_result falco::app::actions::start_webserver(falco::app::state& s)
+falco::app::run_result falco::app::actions::start_webserver(falco::app::state& state)
 {
 #if !defined(_WIN32) && !defined(__EMSCRIPTEN__) && !defined(MINIMAL_BUILD)
-	if(!s.is_capture_mode() && s.config->m_webserver_enabled)
+	if(!state.is_capture_mode() && state.config->m_webserver_enabled)
 	{
-		if (s.options.dry_run)
+		if (state.options.dry_run)
 		{
 			falco_logger::log(falco_logger::level::DEBUG, "Skipping starting webserver in dry-run\n");
 			return run_result::ok();
 		}

-		std::string ssl_option = (s.config->m_webserver_ssl_enabled ? " (SSL)" : "");
-
+		falco_configuration::webserver_config webserver_config = state.config->m_webserver_config;
+		std::string ssl_option = (webserver_config.m_ssl_enabled ? " (SSL)" : "");
 		falco_logger::log(falco_logger::level::INFO, "Starting health webserver with threadiness "
-			+ std::to_string(s.config->m_webserver_threadiness)
+			+ std::to_string(webserver_config.m_threadiness)
 			+ ", listening on "
-			+ s.config->m_webserver_listen_address
+			+ webserver_config.m_listen_address
 			+ ":"
-			+ std::to_string(s.config->m_webserver_listen_port)
+			+ std::to_string(webserver_config.m_listen_port)
 			+ ssl_option + "\n");

-		s.webserver.start(
-			s.offline_inspector,
-			s.config->m_webserver_threadiness,
-			s.config->m_webserver_listen_port,
-			s.config->m_webserver_listen_address,
-			s.config->m_webserver_k8s_healthz_endpoint,
-			s.config->m_webserver_ssl_certificate,
-			s.config->m_webserver_ssl_enabled);
+		state.webserver.start(
+			state,
+			webserver_config);
 	}
 #endif
 	return run_result::ok();
 }

-falco::app::run_result falco::app::actions::stop_webserver(falco::app::state& s)
+falco::app::run_result falco::app::actions::stop_webserver(falco::app::state& state)
 {
 #if !defined(_WIN32) && !defined(__EMSCRIPTEN__) && !defined(MINIMAL_BUILD)
-	if(!s.is_capture_mode() && s.config->m_webserver_enabled)
+	if(!state.is_capture_mode() && state.config->m_webserver_enabled)
 	{
-		if (s.options.dry_run)
+		if (state.options.dry_run)
 		{
 			falco_logger::log(falco_logger::level::DEBUG, "Skipping stopping webserver in dry-run\n");
 			return run_result::ok();
 		}

-		s.webserver.stop();
+		state.webserver.stop();
 	}
 #endif
 	return run_result::ok();
 }

26 changes: 11 additions & 15 deletions userspace/falco/configuration.cpp
@@ -60,11 +60,6 @@ falco_configuration::falco_configuration():
 	m_grpc_enabled(false),
 	m_grpc_threadiness(0),
 	m_webserver_enabled(false),
-	m_webserver_threadiness(0),
-	m_webserver_listen_port(8765),
-	m_webserver_listen_address("0.0.0.0"),
-	m_webserver_k8s_healthz_endpoint("/healthz"),
-	m_webserver_ssl_enabled(false),
 	m_syscall_evt_drop_threshold(.1),
 	m_syscall_evt_drop_rate(.03333),
 	m_syscall_evt_drop_max_burst(1),
@@ -372,21 +367,22 @@ void falco_configuration::load_yaml(const std::string& config_name, const yaml_h
 	m_time_format_iso_8601 = config.get_scalar<bool>("time_format_iso_8601", false);

 	m_webserver_enabled = config.get_scalar<bool>("webserver.enabled", false);
-	m_webserver_threadiness = config.get_scalar<uint32_t>("webserver.threadiness", 0);
-	m_webserver_listen_port = config.get_scalar<uint32_t>("webserver.listen_port", 8765);
-	m_webserver_listen_address = config.get_scalar<std::string>("webserver.listen_address", "0.0.0.0");
-	if(!re2::RE2::FullMatch(m_webserver_listen_address, ip_address_re))
+	m_webserver_config.m_threadiness = config.get_scalar<uint32_t>("webserver.threadiness", 0);
+	m_webserver_config.m_listen_port = config.get_scalar<uint32_t>("webserver.listen_port", 8765);
+	m_webserver_config.m_listen_address = config.get_scalar<std::string>("webserver.listen_address", "0.0.0.0");
+	if(!re2::RE2::FullMatch(m_webserver_config.m_listen_address, ip_address_re))
 	{
-		throw std::logic_error("Error reading config file (" + config_name + "): webserver listen address \"" + m_webserver_listen_address + "\" is not a valid IP address");
+		throw std::logic_error("Error reading config file (" + config_name + "): webserver listen address \"" + m_webserver_config.m_listen_address + "\" is not a valid IP address");
 	}

-	m_webserver_k8s_healthz_endpoint = config.get_scalar<std::string>("webserver.k8s_healthz_endpoint", "/healthz");
-	m_webserver_ssl_enabled = config.get_scalar<bool>("webserver.ssl_enabled", false);
-	m_webserver_ssl_certificate = config.get_scalar<std::string>("webserver.ssl_certificate", "/etc/falco/falco.pem");
-	if(m_webserver_threadiness == 0)
+	m_webserver_config.m_k8s_healthz_endpoint = config.get_scalar<std::string>("webserver.k8s_healthz_endpoint", "/healthz");
+	m_webserver_config.m_ssl_enabled = config.get_scalar<bool>("webserver.ssl_enabled", false);
+	m_webserver_config.m_ssl_certificate = config.get_scalar<std::string>("webserver.ssl_certificate", "/etc/falco/falco.pem");
+	if(m_webserver_config.m_threadiness == 0)
 	{
-		m_webserver_threadiness = falco::utils::hardware_concurrency();
+		m_webserver_config.m_threadiness = falco::utils::hardware_concurrency();
 	}
+	m_webserver_config.m_prometheus_metrics_enabled = config.get_scalar<bool>("webserver.prometheus_metrics_enabled", false);

 	std::list<std::string> syscall_event_drop_acts;
 	config.get_sequence(syscall_event_drop_acts, "syscall_event_drops.actions");
17 changes: 11 additions & 6 deletions userspace/falco/configuration.h
@@ -83,6 +83,16 @@ class falco_configuration
 		std::string m_root;
 	};

+	struct webserver_config {
+		uint32_t m_threadiness = 0;
+		uint32_t m_listen_port = 8765;
+		std::string m_listen_address = "0.0.0.0";
+		std::string m_k8s_healthz_endpoint = "/healthz";
+		bool m_ssl_enabled = false;
+		std::string m_ssl_certificate;
+		bool m_prometheus_metrics_enabled = false;
+	};
+
 	falco_configuration();
 	virtual ~falco_configuration() = default;

@@ -120,12 +130,7 @@ class falco_configuration
 	std::string m_grpc_root_certs;

 	bool m_webserver_enabled;
-	uint32_t m_webserver_threadiness;
-	uint32_t m_webserver_listen_port;
-	std::string m_webserver_listen_address;
-	std::string m_webserver_k8s_healthz_endpoint;
-	bool m_webserver_ssl_enabled;
-	std::string m_webserver_ssl_certificate;
+	webserver_config m_webserver_config;

 	syscall_evt_drop_actions m_syscall_evt_drop_actions;
 	double m_syscall_evt_drop_threshold;
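The refactor above groups the web server settings into one struct with default member initializers, so callers pass a single object instead of seven arguments. The sketch below illustrates the same idea in isolation. `webserver_settings`, `resolve_threadiness` and `describe` are hypothetical names, and `std::thread::hardware_concurrency()` stands in for `falco::utils::hardware_concurrency()`, whose exact behavior is not shown in this diff.

```cpp
#include <cstdint>
#include <string>
#include <thread>

// Hypothetical mirror of the grouped settings; defaults follow the diff.
struct webserver_settings
{
	uint32_t threadiness = 0;
	uint32_t listen_port = 8765;
	std::string listen_address = "0.0.0.0";
	bool ssl_enabled = false;
};

// "0 means auto": fall back to the detected core count, and to 1 if the
// standard library cannot detect it (an assumption made for this sketch).
uint32_t resolve_threadiness(uint32_t configured)
{
	if (configured != 0)
	{
		return configured;
	}
	unsigned int detected = std::thread::hardware_concurrency();
	return detected != 0 ? detected : 1;
}

// Builds a startup message similar in shape to the one logged by
// start_webserver, taking the whole settings object as one parameter.
std::string describe(const webserver_settings& cfg)
{
	return "threadiness " + std::to_string(resolve_threadiness(cfg.threadiness))
		+ ", listening on " + cfg.listen_address
		+ ":" + std::to_string(cfg.listen_port)
		+ (cfg.ssl_enabled ? " (SSL)" : "");
}
```

Adding a new setting, such as the Prometheus flag in this PR, then only touches the struct and the code that reads it, not every call site.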
149 changes: 149 additions & 0 deletions userspace/falco/falco_metrics.cpp
@@ -0,0 +1,149 @@
// SPDX-License-Identifier: Apache-2.0
/*
Copyright (C) 2024 The Falco Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

#include "falco_metrics.h"

#include "app/state.h"

#include <libsinsp/sinsp.h>

/*!
\class falco_metrics
\brief This class is used to convert the metrics provided by the application
and falco libs into a string to be returned by the metrics endpoint.
*/

/*!
\brief content_type to be returned by the webserver's metrics endpoint.

Currently it is the default Prometheus exposition format

https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
*/
const std::string falco_metrics::content_type = "text/plain; version=0.0.4";


/*!
\brief this method takes an application \c state and returns a textual representation of
its configured metrics.

The current implementation returns a Prometheus exposition formatted string.
*/
std::string falco_metrics::to_text(const falco::app::state& state)
{
	static const char* all_driver_engines[] = {
		BPF_ENGINE, KMOD_ENGINE, MODERN_BPF_ENGINE,
		SOURCE_PLUGIN_ENGINE, NODRIVER_ENGINE, GVISOR_ENGINE };

	std::vector<sinsp*> inspectors;
	std::vector<libs::metrics::libs_metrics_collector> metrics_collectors;

	for (const auto& source_info: state.source_infos)
	{
		sinsp *source_inspector = source_info.inspector.get();
		inspectors.push_back(source_inspector);
		metrics_collectors.push_back(libs::metrics::libs_metrics_collector(source_inspector, state.config->m_metrics_flags));
	}

	libs::metrics::prometheus_metrics_converter prometheus_metrics_converter;
	std::string prometheus_text;

	for (auto* inspector: inspectors)
	{
		for (size_t i = 0; i < sizeof(all_driver_engines) / sizeof(const char*); i++)
		{
			if (inspector->check_current_engine(all_driver_engines[i]))
			{
				prometheus_text += prometheus_metrics_converter.convert_metric_to_text_prometheus("engine_name", "falcosecurity", "scap", {{"engine_name", all_driver_engines[i]}});
				break;
			}
		}

		const scap_agent_info* agent_info = inspector->get_agent_info();
		const scap_machine_info* machine_info = inspector->get_machine_info();

		libs::metrics::libs_metrics_collector libs_metrics_collector(inspector, 0);

		prometheus_text += prometheus_metrics_converter.convert_metric_to_text_prometheus("falco_version", "falcosecurity", "falco", {{"version", FALCO_VERSION}});
		prometheus_text += prometheus_metrics_converter.convert_metric_to_text_prometheus("kernel_release", "falcosecurity", "falco", {{"kernel_release", agent_info->uname_r}});
		prometheus_text += prometheus_metrics_converter.convert_metric_to_text_prometheus("hostname", "falcosecurity", "evt", {{"hostname", machine_info->hostname}});

		for (const std::string& source: inspector->event_sources())
		{
			prometheus_text += prometheus_metrics_converter.convert_metric_to_text_prometheus("evt_source", "falcosecurity", "falco", {{"evt_source", source}});
		}
		std::vector<metrics_v2> static_metrics;
		static_metrics.push_back(libs_metrics_collector.new_metric("start_ts",
			METRICS_V2_MISC,
			METRIC_VALUE_TYPE_U64,
			METRIC_VALUE_UNIT_TIME_TIMESTAMP_NS,
			METRIC_VALUE_METRIC_TYPE_NON_MONOTONIC_CURRENT,
			agent_info->start_ts_epoch));
		static_metrics.push_back(libs_metrics_collector.new_metric("host_boot_ts",
			METRICS_V2_MISC,
			METRIC_VALUE_TYPE_U64,
			METRIC_VALUE_UNIT_TIME_TIMESTAMP_NS,
			METRIC_VALUE_METRIC_TYPE_NON_MONOTONIC_CURRENT,
			machine_info->boot_ts_epoch));
		static_metrics.push_back(libs_metrics_collector.new_metric("host_num_cpus",
			METRICS_V2_MISC,
			METRIC_VALUE_TYPE_U32,
			METRIC_VALUE_UNIT_COUNT,
			METRIC_VALUE_METRIC_TYPE_NON_MONOTONIC_CURRENT,
			machine_info->num_cpus));
		static_metrics.push_back(libs_metrics_collector.new_metric("outputs_queue_num_drops",
			METRICS_V2_MISC,
			METRIC_VALUE_TYPE_U64,
			METRIC_VALUE_UNIT_COUNT,
			METRIC_VALUE_METRIC_TYPE_MONOTONIC,
			state.outputs->get_outputs_queue_num_drops()));

		auto now = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now().time_since_epoch()).count();

		static_metrics.push_back(libs_metrics_collector.new_metric("duration_sec",
			METRICS_V2_MISC,
			METRIC_VALUE_TYPE_U64,
			METRIC_VALUE_UNIT_TIME_S_COUNT,
			METRIC_VALUE_METRIC_TYPE_MONOTONIC,
			(uint64_t)((now - agent_info->start_ts_epoch) / ONE_SECOND_IN_NS)));

		for (auto metrics: static_metrics)
		{
			prometheus_metrics_converter.convert_metric_to_unit_convention(metrics);
			prometheus_text += prometheus_metrics_converter.convert_metric_to_text_prometheus(metrics, "falcosecurity", "falco");
		}
	}

	for (auto metrics_collector: metrics_collectors)
	{
		metrics_collector.snapshot();
		auto metrics_snapshot = metrics_collector.get_metrics();

		for (auto& metrics: metrics_snapshot)
		{
			prometheus_metrics_converter.convert_metric_to_unit_convention(metrics);
			std::string namespace_name = "scap";
			if (metrics.flags & METRICS_V2_RESOURCE_UTILIZATION || metrics.flags & METRICS_V2_KERNEL_COUNTERS)
			{
				namespace_name = "falco";
			}
			prometheus_text += prometheus_metrics_converter.convert_metric_to_text_prometheus(metrics, "falcosecurity", namespace_name);
		}

	}
	return prometheus_text;
}
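For readers unfamiliar with the Prometheus text exposition format that `to_text` targets, the sketch below builds a single sample line of the form `name{label="value"} number`. It is not the libs `prometheus_metrics_converter`: `escape_label_value` and `format_sample` are hypothetical helpers, and joining prefix, subsystem and name with underscores is an assumption about naming made only for this illustration.

```cpp
#include <map>
#include <string>

// Escape a label value as the text format requires: backslash, double
// quote and line feed each get a leading backslash.
std::string escape_label_value(const std::string& value)
{
	std::string out;
	for (char c : value)
	{
		switch (c)
		{
		case '\\': out += "\\\\"; break;
		case '"': out += "\\\""; break;
		case '\n': out += "\\n"; break;
		default: out += c;
		}
	}
	return out;
}

// Hypothetical helper: emits "<prefix>_<subsystem>_<name>{k="v",...} <value>\n".
// The label block is omitted entirely when there are no labels.
std::string format_sample(const std::string& prefix,
			  const std::string& subsystem,
			  const std::string& name,
			  const std::map<std::string, std::string>& labels,
			  unsigned long long value)
{
	std::string line = prefix + "_" + subsystem + "_" + name;
	if (!labels.empty())
	{
		line += "{";
		bool first = true;
		for (const auto& kv : labels)
		{
			if (!first)
			{
				line += ",";
			}
			line += kv.first + "=\"" + escape_label_value(kv.second) + "\"";
			first = false;
		}
		line += "}";
	}
	line += " " + std::to_string(value) + "\n";
	return line;
}
```

Informational values such as a version or hostname are commonly exposed this way: the interesting data lives in the label and the sample value is a constant.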
32 changes: 32 additions & 0 deletions userspace/falco/falco_metrics.h
@@ -0,0 +1,32 @@
// SPDX-License-Identifier: Apache-2.0
/*
Copyright (C) 2024 The Falco Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
#pragma once

#include "configuration.h"

#include <libsinsp/sinsp.h>

namespace falco::app {
struct state;
}

class falco_metrics
{
public:
	static const std::string content_type;
	static std::string to_text(const falco::app::state& state);
};
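The header exposes only a content type constant and a static `to_text` function, which keeps the HTTP handler side small. The webserver changes are not visible in this part of the diff, so the following is a sketch under stated assumptions: `http_response`, `text_metrics` and `handle_metrics` are hypothetical, the metric name is made up, and answering 404 when the endpoint is disabled is a guess rather than the behavior of this PR.

```cpp
#include <string>

// Hypothetical minimal response type; a real HTTP library provides its own.
struct http_response
{
	int status = 200;
	std::string body;
	std::string content_type;
};

// Hypothetical stand-in with the same shape as the class in this header:
// a content type constant plus a static function that renders text.
struct text_metrics
{
	static const std::string content_type;

	static std::string to_text(unsigned long long num_drops)
	{
		// Made-up metric name, used only to produce some output.
		return "example_outputs_queue_num_drops_total " + std::to_string(num_drops) + "\n";
	}
};

const std::string text_metrics::content_type = "text/plain; version=0.0.4";

// Assumption: a disabled endpoint is simply not served.
http_response handle_metrics(bool endpoint_enabled, unsigned long long num_drops)
{
	http_response res;
	if (!endpoint_enabled)
	{
		res.status = 404;
		return res;
	}
	res.body = text_metrics::to_text(num_drops);
	res.content_type = text_metrics::content_type;
	return res;
}
```

Because both members are static, the handler needs no metrics object of its own and can render a fresh snapshot on every request.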