-
Notifications
You must be signed in to change notification settings - Fork 336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SAMZA-2412 : Adding SamzaContainer metric that is 0 when container-starting, 1 when started. #1228
base: master
Are you sure you want to change the base?
SAMZA-2412 : Adding SamzaContainer metric that is 0 when container-starting, 1 when started. #1228
Conversation
You should use a gauge, not counter. In inGraphs, a counter is displayed as a rate (v2-v1/t2-t1).
A gauge should not reset to 0 if not updated, it should retain its last updated value. |
Making the new metric a gauge
Fair enough, I was confused between Timers and Gauges. |
@rmatharu Let's also add this to the metrics docs. |
@@ -764,6 +764,7 @@ class SamzaContainer( | |||
containerListener.afterStart() | |||
} | |||
metrics.containerStartupTime.set((System.nanoTime() - startTime)/1000000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use (container-startup-time
) metric instead to infer if the container has started?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can but its relatively harder to automate since we need to look for changes
@@ -764,6 +764,7 @@ class SamzaContainer( | |||
containerListener.afterStart() | |||
} | |||
metrics.containerStartupTime.set((System.nanoTime() - startTime)/1000000) | |||
metrics.containerStarted.set(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR, but curious what will be the behavior of the auto sizing controller if this metric is broken. Wondering if it can differentiate between metrics broken vs container not started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this (or another similar metric) the controller has no way of figuring out that a container has finished state-restore, upto which point any input-lag increase does not warrant scale-up
Adding metric that is 0 when container-starting, 1 when started.
Symptom: External monitoring and control (e.g., autosizing) require a way to know when a container has started (e.g., state-restore has completed). Currently, they have to rely on restore-time or container-startup-time metrics, which are a bit tricky to use.
Cause: None
Fix: Added a new metric, a gauge, which is 0 and remains at 0 after the container has started. A Timer on the other hand will go to 0 if its not updated.
Tests: Samza test job with and without state.