exometer_report_statsd becomes a CPU and memory hog when system time jumps forward #35
Comments
I will take a look. The purpose of the |
Hi. Any updates on this front? |
No, apologies. I noticed that I had an un-pushed branch where I had made some simple corrections for clock changes. I just pushed it and created a PR (#44), but I will say honestly that I haven't tested it beyond verifying that it doesn't break the (unmodified) test suite. |
Hi @uwiger. I have tried your branch and it seems to crash the reporter soon after application start. I've added a few printf calls inside adjust_interval:
adjust_interval(Time, T0) ->
    io:format("adjust_interval(~p, ~p)~n", [Time, T0]),
    T1 = os:timestamp(),
    case tdiff(T1, T0) of
        D when D > Time ->
            io:format("D = ~p, T1 = ~p~n", [D, T1]),
            %% Most likely due to clock adjustment
            {Time, T1};
        D ->
            io:format("D = ~p, T0 = ~p~n", [D, T0]),
            {D, T0}
    end.
and here's the output I see when starting my app (the error report is using Elixir syntax):
|
@alco Could you create a test case which reproduces that error? |
Any updates on this? |
I can also confirm that the change from @uwiger crashes with this error right after start, before any metrics are reported (I've merged the change onto the latest master).
|
We have a few reporters configured with an interval of 1000 ms using exometer_report_statsd. Whenever the system time jumps forward, exometer_report_statsd can't keep up with the amount of incoming {exometer_report, ...} messages and, given enough metrics, the memory used by the VM will keep growing indefinitely.
The issue originates from this line where the second argument becomes negative after a forward time jump. The underlying issue is that the internal timestamp exometer_report is keeping can take a long time to catch up with the system's time because it is only increasing in multiples of the specified report interval here.
So if we have a report interval of 1000 ms and the time jumps forward by one hour, it will generate at least 3600 reports with interval 0 until it catches up. In a real app where we have quite a few metrics, we haven't seen it catch up at all when using exometer_report_statsd: the message queue would grow continuously while the Erlang process would be eating 200% CPU. When I log the value of Time - tdiff(T1, T0) to the console, it goes down a bit and then jumps back up, so it's kind of stuck in a loop.
Using exometer_report_tty instead does not result in that degenerate behaviour and it would catch up rather quickly back to the current time.
We would expect the exometer_report_statsd reporter to only send a single report in the case of the time jump because there is no useful info that can be collected between two reports which have interval 0 between them.
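To make the arithmetic above concrete, here is a minimal sketch. It is not exometer code: the module name catchup_sketch and both functions are hypothetical, and timestamps are plain integers in milliseconds. naive_reports/2 counts how many zero-interval reports are emitted when the internal timestamp advances by only one interval per report, and next_report/3 shows one possible way to resynchronise after a jump so that only a single report is sent.

```erlang
-module(catchup_sketch).
-export([naive_reports/2, next_report/3]).

%% Hypothetical model of the current behaviour: the internal timestamp
%% advances by Interval ms per report, so after the clock jumps forward
%% by Jump ms one report is emitted per step until it has caught up.
naive_reports(Interval, Jump) when Interval > 0, Jump >= 0 ->
    naive_reports(Interval, Jump, 0).

naive_reports(_Interval, Behind, Count) when Behind =< 0 ->
    Count;
naive_reports(Interval, Behind, Count) ->
    naive_reports(Interval, Behind - Interval, Count + 1).

%% Hypothetical alternative: T0 is the time the last report was due,
%% Now is the current time, both in ms. Returns {Delay, NewT0}.
next_report(T0, Now, Interval) when Interval > 0 ->
    Elapsed = Now - T0,
    if
        Elapsed < 0; Elapsed > Interval ->
            %% Clock jumped backward or forward: drop the backlog and
            %% restart the schedule from the current time.
            {Interval, Now};
        true ->
            %% Normal case: compensate for the time already spent.
            {Interval - Elapsed, T0 + Interval}
    end.
```

With a 1000 ms interval and a one hour jump, naive_reports(1000, 3600000) returns 3600, matching the estimate above, while next_report(0, 3600000, 1000) returns {1000, 3600000}, i.e. one resynchronised schedule instead of a backlog. If Now were taken from erlang:monotonic_time/1 rather than os:timestamp/0, system time jumps should not be visible to the scheduler at all, though that would be a larger change than the sketch assumes.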