[Draft PR][Don't merge until upgrade release rolls out] Fix telemetry spike issue in telegraf removal #926
…into fixhispiketelemetry
This reverts commit 0bbd50f.
…into fixhispiketelemetryNew
This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This PR was closed because it has been stalled for 12 days with no activity.
PR Description
This PR fixes the telemetry spike that appeared after telegraf was removed. It fixes a bug in the ticker used for telemetry aggregation and adds mutex locks so that metric values are read correctly.
test cluster
test image: 6.8.14-fixhispiketelemetryNew-06-24-2024-c6cbed86
AI resource
AI query:
customMetrics
| where customDimensions contains "testrecalertssohamksm"
| extend agentversion = tostring(customDimensions.agentversion)
| where agentversion !contains "win"
| where customDimensions.agentversion contains "6.8.14-fixhispiketelemetryNew-06-24-2024-c6cbed86"
| extend agentversion = strcat(agentversion, "/", name)
| summarize count() by bin(timestamp, 5m), agentversion
| render timechart
The screenshot below shows that telemetry volume has dropped with the ticker fix. The spike occurred on the following metrics, which use the ticker: otelcollector_cpu_usage_050, otelcollector_cpu_usage_095, metricsextension_cpu_usage_050, metricsextension_cpu_usage_095, metricsextension_memory_rss_050, metricsextension_memory_rss_095, otelcollector_memory_rss_050, otelcollector_memory_rss_095.
The memory usage of the pods is not high anymore.