-
+1 for exporting the kernel event rate and other useful Falco metrics to Prometheus. Specifically, it would be great if the Falco maintainers could create a panel to visualise the kernel event rate. This could be done with a Prometheus query that applies a PromQL rate to the kernel event metrics exposed to Prometheus.
Note: I would like to propose an alternative to cAdvisor for surfacing conventional SRE metrics: let's first evaluate the metrics available from kube-state-metrics (ksm). One benefit is that it is a Kubernetes project, so it is consistent with the WG's goal of creating an architectural reference built on tools from the CNCF ecosystem. I propose that we take the ksm metrics exported to Prometheus and use them with PromQL to write meaningful queries, which are then visualised in the panels of our Grafana dashboard.
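To make the kernel event rate idea above concrete, here is a minimal Python sketch of what a rate over a counter computes: the average per-second increase across a window of scraped samples. It is a simplification, not PromQL itself: it handles counter resets but does not do the window-boundary extrapolation that PromQL's rate() performs. The metric name in the comment is hypothetical, since the thread does not name the actual Falco metric.

```python
from typing import List, Tuple

# One scraped sample: (unix timestamp in seconds, counter value).
Sample = Tuple[float, float]

# Hypothetical PromQL equivalent (the metric name is an assumption, not Falco's real one):
#   rate(falco_kernel_events_total[5m])


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase of a counter over the given samples.

    Counter resets (value drops below the previous one) are treated as a
    restart from zero. No extrapolation to window boundaries is done.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")

    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # Counter was reset (e.g. the process restarted): count from zero.
            increase += value
        else:
            increase += value - previous
        previous = value

    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return increase / elapsed


if __name__ == "__main__":
    # Three scrapes, 15 seconds apart, of an event counter.
    scrapes = [(0.0, 100.0), (15.0, 1600.0), (30.0, 3100.0)]
    print(simple_rate(scrapes))
```

A Grafana panel would plot this value over time, one point per evaluation window.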
-
Could you give us a list of the SRE metrics you would like to be monitored for the Falco project?
In this comment you also listed some other metrics: #11 (comment). The mapping is not straightforward to me. I have a guess for some of them, but I am not sure about the others. Could you help me figure this out?
-
Hi there! So it looks like this repo already has a full kube-prometheus stack deployed (judging from the HelmRelease here). The interesting thing is that kube-prometheus includes:
It'd be worth clarifying what the intent is here.
One more point worth mentioning is that the current configuration is a bit opaque. It's a great start in that all the pieces are already deployed, but it comes at the cost of granularity: we take an entire Helm chart or repo and deploy it without really knowing what gets deployed. It might be worth rendering the charts and manifests and committing the output, so that we can see which resources are actually there. Hopefully that helps move this issue forward! 🙂 I'd be happy to contribute a couple of PRs if necessary!
-
@rossf7 could we mark this discussion as concluded? See my last comment #14 (reply in thread).
-
Hi @incertum, yes sure. I've added a note so we include this when writing the proposal for collecting metrics.
-
Since Falco is the first project to onboard to the TAG Environmental Sustainability Green Reviews Initiative, there is an opportunity to discuss the metrics reporting responsibilities to lay the foundation for organic growth of the initiative.
Proposing the following based on previous discussions:
Project: If applicable, the project reports custom internal metrics to a Green Reviews-hosted Prometheus and assists in creating a meaningful Grafana dashboard.
Green Review: SRE Metrics. Even if the project can supply similar metrics, the Green Review team uniformly collects traditional SRE Metrics across all namespaces. These complement the Kepler energy metrics and make the impact on resource utilization easier to understand. The Green Review team manages this deployment. One option could be cAdvisor running as a DaemonSet and feeding into Prometheus, which in turn feeds a Grafana dashboard accessible to the projects.
References:
SRE Metrics: Traditional metrics related to CPU and memory usage, as outlined on Falco's page https://falco.org/docs/metrics/performance/ (or check out https://github.com/google/cadvisor), plus more universally useful metrics (TBD).
CC @AntonioDiTuri @nikimanoledaki @rossf7
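As a rough illustration of the cAdvisor option mentioned above, the Python sketch below builds a per-pod CPU usage query over cAdvisor's container_cpu_usage_seconds_total counter and parses an instant-query response from the Prometheus HTTP API (/api/v1/query). The Prometheus address, the namespace value, and the helper names are assumptions made for this example. The proposal does not prescribe any of them.

```python
import json
from typing import Dict
from urllib.parse import urlencode

# Hypothetical address of the Green Reviews-hosted Prometheus.
PROMETHEUS_URL = "http://prometheus.example:9090"


def cpu_query(namespace: str, window: str = "5m") -> str:
    """PromQL for per-pod CPU usage (cores) in one namespace, from cAdvisor data."""
    return (
        "sum by (pod) (rate(container_cpu_usage_seconds_total"
        f'{{namespace="{namespace}"}}[{window}]))'
    )


def query_url(base_url: str, promql: str) -> str:
    """URL for a Prometheus instant query (GET /api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(body: str) -> Dict[str, float]:
    """Map pod name to value from an instant-vector JSON response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        series["metric"].get("pod", ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


if __name__ == "__main__":
    # "falco" as the namespace name is an assumption for this example.
    print(query_url(PROMETHEUS_URL, cpu_query("falco")))
    # Canned response in the documented shape, used instead of a live server.
    sample = json.dumps({
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [{"metric": {"pod": "falco-abcde"}, "value": [1700000000, "0.25"]}],
        },
    })
    print(parse_vector(sample))
```

The same PromQL string could be pasted directly into a Grafana panel; the Python wrapper is only there to show the request and response shape.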