I know this will be hard to answer, but maybe you can make an educated guess.
We are operating a router network with three datacenters.
A producer at our local site, connected to the local SwiftMQ router, encountered this error:
Feb 24 15:02:37 error publishing message: MessagingException: Can't retrieve destination 'App_Inbox@R1_HA' from JNDI.
Feb 24 15:02:37 disconnected from queue [App_Inbox@R1_HA]
R1 is a remote location.
I checked the router logfiles; neither the local nor the remote router showed any errors.
In SwiftMQ Explorer the router network was available and I could manage the remote router. I checked the JNDI Swiftlet for aliases, and they looked fine to me. Being out of sensible ideas, I performed a failover on the remote router. After some time the local producer was able to deliver messages again, so the issue was resolved. But what could it have been?
Ok, I didn't know about that, we just have static routes in place.
But if I had a static remote queue, the local router would have accepted the messages, and then what? Something was broken, and it stayed broken for hours, although the router was available in the network and could be administered via Explorer from a remote router, and there were no other visible issues apart from that one queue that could not be looked up via JNDI.
I assume the messages would have been queued in rt$RT1, which we are not monitoring. Would there have been any way to notice the issue?
Maybe it would be a good idea to start monitoring rt$remoterouterX; there is a lot of traffic there, but it should still contain 0 messages most of the time. Sadly, I did not check whether any other messages were queued in rt$RT1; I didn't think of it because everything else seemed to work within the router network.
These static remote queues are not real queues but only javax.jms.Queue objects, so that you can do a JNDI lookup while a remote router is not connected. The only effect is that your lookup succeeds; if you also have a static route, messages go to the routing queue for the remote router (store and forward).
I must admit I have never set up a static remote queue in the past (the routers were always there!), and now we have this issue once again. As a matter of fact, there seems to be no problem with the remote router. Everything works just fine; only the JNDI lookup is having issues. This also seems to be isolated to a single client, but I am not entirely sure about that; I can only find errors there. Any other ideas, or would a thread dump help?
Same as in the past:
MessagingException: Can't retrieve destination 'xxx@routerY' from JNDI.
I have resolved the issue for now: while updating to 10.2.0 I had to perform several failovers during the rolling upgrade, and now it's OK. But while I did that, another client, this time a hot-deployed JMS app, had the same problem with another remote router.
It is a 3-router network, all connected with each other, but one router sort of sits in the middle. Let's say routerX is in the middle and connects to routerY and routerZ. Remote queues on routerY failed for connections to routerX, but initially for just 1 client. After routerX was updated and performed a failover, another client got the same error for remote queues on routerZ. In the end, after all routers were updated and switched back and forth, all remote queues worked.
Sorry, I was too fast. They did work for a while, but now I am seeing the errors again, for the client that initially had the problem.
As we continue to suffer from a bad connection, this could be the case. Sometimes it works, then it doesn't. An entire cold start of the router network did not resolve the issue either. How can I increase the lookup timeout? Is it a server/listener setting or a client setting?
EDIT: I am still curious how to increase the timeout, but it seems that by removing the infinite loop in the rt$ queue I got rid of the delay that caused the issue. Thanks for your input, I probably wouldn't have found it otherwise.
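For anyone hitting the same symptom: independent of any router or JNDI timeout setting, a client can tolerate a slow lookup by retrying it. Below is a minimal sketch, not a SwiftMQ feature. The class name `LookupRetry`, the method `lookupWithRetry`, and its parameters are hypothetical, and it assumes a failed lookup surfaces either as a `javax.naming.NamingException` or as a `null` result.

```java
import java.util.concurrent.Callable;
import javax.naming.NameNotFoundException;
import javax.naming.NamingException;

// Hypothetical helper, not part of SwiftMQ or JMS: retries a JNDI lookup a fixed number of times.
final class LookupRetry {
    private LookupRetry() {}

    /**
     * Calls the lookup up to maxAttempts times, sleeping delayMillis between failed attempts.
     * A NamingException or a null result counts as a failed attempt (an assumption about
     * how a failed lookup shows up); any other exception is propagated immediately.
     */
    static <T> T lookupWithRetry(Callable<T> lookup, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NamingException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                T result = lookup.call();
                if (result != null) {
                    return result;
                }
                lastFailure = new NameNotFoundException("lookup returned null");
            } catch (NamingException e) {
                lastFailure = e;
            }
            // Wait before the next attempt, but not after the final one.
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis);
            }
        }
        throw lastFailure;
    }
}
```

With an existing `javax.naming.Context` named `ctx` (hypothetical here), a call could look like `LookupRetry.lookupWithRetry(() -> ctx.lookup("App_Inbox@R1_HA"), 5, 2000L)`. This only hides a slow lookup from the producer; it does not address the underlying delay, which in this case came from the loop in the rt$ queue.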