Keep Alive
Author: Lucas Peetz Dulley
State: Design
The goal of this feature is to make Collage LocalNodes aware of the status, round-trip time(?) and lastAliveTime of their connected remote nodes. Together with the eq-timeouts, this information allows unresponsive clients to be dealt with in a more deliberate manner.
- Usage of the keep-alive feature is optional and defaults to off.
- A keep-alive (ping) packet is sent when no reply packets or data have been received on the node connection within an interval (half of the global timeout?).
- Applications can access the keep-alive information of a client which is unresponsive or has timed out.
- E.g.: EqPly can open heavy .ply files (e.g., lucy.ply) without failing due to frame rendering timeouts while the model is being distributed.
- Whenever the receiver thread receives data from a remote node, the lastAliveTime for the node is updated.
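The idle-detection rule described above can be sketched as follows. This is only a minimal illustration, not Collage code: `RemoteNodeState`, `onDataReceived` and `needsPing` are hypothetical names, times are assumed to be in milliseconds, and the ping interval (for example half of the global timeout) is passed in by the caller because the design leaves it open.

```cpp
#include <cstdint>

// Hypothetical per-connection state; only lastAliveTime is named in the design.
struct RemoteNodeState
{
    int64_t lastAliveTime = 0; // time (ms) at which data was last received
};

// Assumed hook: called by the receiver thread for any packet from the node.
inline void onDataReceived( RemoteNodeState& node, const int64_t now )
{
    node.lastAliveTime = now;
}

// True if the connection has been silent for at least pingInterval ms,
// i.e. the keep-alive thread should send a ping to this node.
inline bool needsPing( const RemoteNodeState& node, const int64_t now,
                       const int64_t pingInterval )
{
    return now - node.lastAliveTime >= pingInterval;
}
```

Since any received data refreshes `lastAliveTime`, a busy connection never triggers a ping under this rule; only idle connections cause extra traffic.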
co::Node:
/** The node is responding (is alive) */
int64_t _lastAliveTime; //!< last time packets were received
int64_t co::Node::getLastAliveTime() // might be useful
virtual bool co::Node::isAlive() {} //pure virtual
co::LocalNode:
virtual bool isNodeAlive( NodePtr node ) {}
/** called from (ping/keepalive) thread */
bool co::LocalNode::sendPing( NodePtr remoteNode ); // sends a ping packet to remote node
/** process ping request. called from receiver thread (not queued in command queue) */
// updates lastAliveTime in receiver thread for the node which sent the packet
// sends a ping reply packet to "local" node
bool co::LocalNode::_cmdPing( Command& command );
/** process ping reply response. called from receiver thread (not queued in command queue) */
// updates lastAliveTime for the node which replied to the packet
// remoteNode->_lastAliveTime = getTime64();
bool co::LocalNode::_cmdPingReply( Command& command );
/** variable indicating if keepalive signaling is on or off */
bool _keepAliveEnabled;
Ping Packets:
/** NEW: node ping packet */
co::NodePingPacket
// uint64_t transmitTime;
/** NEW: node ping reply packet */
co::NodePingReplyPacket( const NodePingPacket* request ) : transmitTime( request->transmitTime ) {}
// uint64_t transmitTime;
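A minimal sketch of how the echoed transmitTime could yield a round-trip time on the local node. The structs below are simplified stand-ins and not the actual co::NodePingPacket types; `roundTripTime` is a hypothetical helper, and both timestamps are assumed to come from the local node's clock.

```cpp
#include <cstdint>

// Simplified stand-in for the proposed ping packet.
struct PingPacket
{
    uint64_t transmitTime; // local time at which the ping was sent
};

// Simplified stand-in for the proposed reply: echoes the request's timestamp.
struct PingReplyPacket
{
    explicit PingReplyPacket( const PingPacket& request )
        : transmitTime( request.transmitTime ) {}

    uint64_t transmitTime;
};

// Hypothetical helper: round-trip time as seen by the local node.
// Returns 0 if the reception time precedes the echoed transmit time.
inline uint64_t roundTripTime( const PingReplyPacket& reply,
                               const uint64_t receptionTime )
{
    return receptionTime >= reply.transmitTime
               ? receptionTime - reply.transmitTime
               : 0;
}
```

Because the remote node only echoes the timestamp, no clock synchronization between the two nodes is needed for this measurement.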
Keep-alive [ON|OFF] EQ-Config option?
This Collage-based keep-alive signal does not take any action or decide whether a remote node is responsive or not. It only gathers and provides information about the actual states of the remote nodes from the local node's perspective.
If the round-trip time (reception time minus transmit time) is greater than Global::getTimeout(), the remote node might be dead?
How to handle Inputframe/ timeout exceptions
Dealing with the EQ timeout exceptions TIMEOUT_INPUTFRAME and TIMEOUT_FRAMESYNC in eq\client\compositor.cpp, eq\client\framedata.cpp and eq\client\node.cpp.
Throw an exception only if a node is considered unreachable. The application should be able to judge that.
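One possible shape for letting the application judge reachability is a callback consulted when a wait times out. This is a sketch under assumptions: `NodeStatus`, `ReachabilityJudge` and `handleTimeout` are hypothetical names that exist in neither Equalizer nor Collage, and `std::runtime_error` stands in for the actual EQ timeout exception.

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>

// Hypothetical snapshot of the keep-alive information for one remote node.
struct NodeStatus
{
    int64_t lastAliveTime; // ms, as maintained by the receiver thread
};

// Application-supplied policy: returns true while the node counts as reachable.
using ReachabilityJudge =
    std::function< bool( const NodeStatus& status, int64_t now ) >;

// Assumed to be called when a wait (e.g. for an input frame) times out.
// Throws only if the application considers the node unreachable;
// otherwise returns true so the caller can keep waiting.
inline bool handleTimeout( const NodeStatus& status, const int64_t now,
                           const ReachabilityJudge& isReachable )
{
    if( !isReachable( status, now ))
        throw std::runtime_error( "remote node considered unreachable" );
    return true;
}
```

An application such as EqPly could then supply a judge that tolerates long silences during model distribution, while other applications keep a stricter limit.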