I see quite a few of these in the logs, yet they don't cause exceptions. We run 9.2.5 in active/passive replicated HA with 2 nodes, so the exception may be handled by the HA logic in the client API. I'd like to know whether a keep-alive expiry should cause an exception that the application can detect?
SwiftMQ sends keepalive messages from both ends of the connection. After 5 missing keepalives the connection is considered dead and is closed. The cause is usually a network problem or a killed client that left a half-open TCP connection.
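To make the countdown concrete, here is a minimal Java sketch of a missed-keepalive counter. It is not SwiftMQ's actual code: the class name KeepaliveMonitor and its methods are hypothetical, and the only fact taken from the description above is the limit of 5 missing keepalives.

```java
// Hypothetical sketch of a missed-keepalive counter; not SwiftMQ's implementation.
final class KeepaliveMonitor {
    private final int maxMissed; // 5 according to the description above
    private int remaining;       // intervals left before the connection is closed
    private boolean closed;

    KeepaliveMonitor(int maxMissed) {
        this.maxMissed = maxMissed;
        this.remaining = maxMissed;
    }

    // Called whenever a keepalive from the peer has been read: resets the counter.
    synchronized void onKeepaliveReceived() {
        if (!closed) {
            remaining = maxMissed;
        }
    }

    // Called once per keepalive interval in which nothing arrived.
    // Returns true once the connection is considered dead.
    synchronized boolean onIntervalElapsedWithoutKeepalive() {
        if (closed) {
            return true;
        }
        remaining--;
        if (remaining <= 0) {
            closed = true;
        }
        return closed;
    }

    synchronized int remaining() {
        return remaining;
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

With new KeepaliveMonitor(5), four silent intervals leave the connection open, a single received keepalive resets the counter to 5, and only five silent intervals in a row close it.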
I have another question regarding keep-alive: could a slow receiver cause a keep-alive "timeout"? I think the keep-alive messages are sent on the same TCP socket as the application-level messages. So could a slow receiver, due to message build-up, cause a reconnect because the TCP messages are not read off the socket and the keep-alive messages are queued on the receiving socket, or even in the TCP buffer of the socket on the server side? Or do both the client and server APIs read messages off the socket and queue them in memory, leaving the TCP buffers empty at all times and allowing keep-alive messages to flow freely?
So if I see the keep-alive counter reaching 0, it's 100% a network problem and can't be an application problem whereby the TCP buffers fill up because the client is not picking up messages? I guess a JVM GC pause could cause a problem as the application has no control over that, so application-level problems can't be ruled out that easily, can they?
The application can't be the reason for the keepalive counter reaching 0.
Look at these pictures:
Keepalive is handled by the respective reader.
The only problem that can arise is if you have very large messages, no limit in consumer-cache-size-kb and a small network buffer. The router-output and client-input buffers, respectively, are extended by the configured extend size (default 64 KB) at a time, and that can take a long time.
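A rough Java sketch of the arithmetic behind that last point follows. Only the 64 KB extend size comes from the answer above. The class BufferGrowth, the initial buffer size, and the idea that each extension allocates a new buffer and copies the old contents are assumptions made for illustration, not a description of SwiftMQ internals.

```java
// Hypothetical arithmetic sketch; buffer behavior is assumed, not taken from SwiftMQ.
final class BufferGrowth {
    // Number of fixed-size extensions needed until the buffer holds messageBytes.
    static long extensionsNeeded(long initialBytes, long extendBytes, long messageBytes) {
        if (extendBytes <= 0) {
            throw new IllegalArgumentException("extendBytes must be positive");
        }
        if (messageBytes <= initialBytes) {
            return 0;
        }
        long missing = messageBytes - initialBytes;
        return (missing + extendBytes - 1) / extendBytes; // ceiling division
    }

    // Total bytes copied IF every extension copies the old buffer (assumption).
    // Buffer sizes before each copy: initial, initial + extend, ..., initial + (n-1) * extend.
    static long bytesCopied(long initialBytes, long extendBytes, long messageBytes) {
        long n = extensionsNeeded(initialBytes, extendBytes, messageBytes);
        return n * initialBytes + extendBytes * (n * (n - 1) / 2);
    }
}
```

For example, with an assumed 64 KB initial buffer and the default 64 KB extend size, BufferGrowth.extensionsNeeded(65536, 65536, 100L * 1024 * 1024) returns 1599, so a single 100 MB message would need 1,599 extensions. Under the copy-on-extend assumption the copied volume grows roughly with the square of the message size, which is one way such a transfer could take long.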