Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application hangs after sending 10.000 messages when LogBroker is not running #12

Open
luebbe opened this issue Feb 3, 2021 · 8 comments

Comments

@luebbe
Copy link

luebbe commented Feb 3, 2021

I'm trying to track down a problem when our application hangs when using GrijjyLog and the log broker is not running. After about 10.000 log messages (10-15 minutes), the application suddenly becomes unresponsive (Windows 10, Delphi 10.3.3 or 10.4.1).
When I start the log broker, the application becomes responsive immediately. It is, as if the suddenly available connection to the log broker is un(b)locking something.
When the broker runs all the time, the application runs for hours without problems.
When I log to a memo instead of using grijjylog.send, the application also runs for hours.

I put a breakpoint on FMessageQueueLock.Leave in

procedure TZMQProtocol.Send(var AMsg: PZMessage);
...
  FMessageQueueLock.Enter;
  try
    FMessageQueue.Enqueue(AMsg);
  finally
    FMessageQueueLock.Leave;
  end;
  { the zmsg_send will automatically destroy the message, so we need to
   prevent the destruction of the object by our own method }
  AMsg := nil;
end;

with FMessageQueue.count > 10 as condition, but it only triggered a few times during startup of the application and never later. So it's probably not FMessageQueue running full.

Basically the question is "What happens if nobody listens for a long time"?

Any help is greatly appreciated.

@luebbe
Copy link
Author

luebbe commented Feb 4, 2021

This bug is 100% reproducible with the provided sample apps.
Steps to reproduce:

  1. start the log viewer
  2. start the FMX sample client
    1. click on "connect"
    2. go to "other"
    3. send 10.000 messages
    4. send 10.000 messages
      -> sample client hangs
  3. start the log broker
    -> sample client is responsive again
    -> exactly 10.000 messages are immediately displayed in log viewer.
    Where are the other 10.000? So either the protocol discards 10.000 messages, but still blocks the sending application or the log viewer is limited to 10.000 messages?
    The timestamp of the messages is the timestamp when they were received by the client, not the timestamp when they were sent.

If you want you can:

  1. stop the log broker
  2. repeat 2.iii until the client hangs again

@luebbe luebbe changed the title Application hangs after a while when LogBroker is not running Application hangs after sending 10.000 messages when LogBroker is not running Feb 4, 2021
@allendrennan
Copy link
Collaborator

This is not really a bug, it's a feature. ZeroMQ is designed so when messages are streamed between nodes and the target node is not available, the sender can queue messages until the node comes back online. We set the upper limit to 10K for sending and receive queues so memory is not consumed indefinitely (see SND_HWM constant in the following),
https://github.com/grijjy/DelphiZeroMQ/blob/master/PascalZMQ.pas

ZeroMQ lets you override this behavior and set the value to 0, which means unlimited see,
http://api.zeromq.org/3-0:zmq-getsockopt
Specially the parameters, ZMQ_SNDHWM and ZMQ_RCVHWM.

GrijjyLogBroker was never intended to operate as a forever running remote logger, but you could certainly create a custom version of the concept using the ExampleWorker, ExampleBroker and ExampleClient projects that we provide in https://github.com/grijjy/DelphiZeroMQ that more precisely matches your needs.

@luebbe
Copy link
Author

luebbe commented Feb 4, 2021

Thanks for your reply.
Is there any notification that a sender can not get rid of the messages (and the queue is filling up)? The current implementation blocks the sending application hard until the receiver comes online. I'm not interested in raising any limits. That would just make the application hang later and not change anything. A queue size of 10 is as good as 10.000.

If I knew that there's no broker listening, I wouldn't send any log messages.

Is there some option to tell ZMQ to discard messages that it can't send?

@allendrennan
Copy link
Collaborator

ZeroMQ has the ability to queue to disk. This article covers the subject,
https://dzone.com/articles/zeromq-flow-control-and-other

@luebbe
Copy link
Author

luebbe commented Feb 4, 2021

Queuing to disk doesn't solve the problem, it only delays it. IIUC, there's not option not to send a message if delivery is not possible, because ZMQ wants to deliver messages as reliably as possible and rather blocks the sender than telling it "sorry, I can't handle that now, try again later" and allowing it to continue.
This doesn't even allow the sender to handle the problem.

@allendrennan
Copy link
Collaborator

I am not an expert in ZeroMQ but they have a concept of non-blocking, ZMQ_NOBLOCK. Like I mentioned before, the Grijjy logger isn't designed to work in the way you want, nor does it expose every possible pattern. You can build these types of solutions with various patterns using our ZeroMQ header translations for Delphi.

@luebbe
Copy link
Author

luebbe commented Feb 4, 2021

Thanks, I'll see what I get done. For the time being I disabled logging by default. It was nice to be able to fire up the log broker and -viewer at any time while the application is running and see the log messages coming in.

@luebbe
Copy link
Author

luebbe commented Feb 5, 2021

BTW: I'd suggest to leave this issue open and tag it as "question", because other people might run into the same issue too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants