We're still getting locked messages (albeit very rarely) and there is no obvious reason for them (we have read all previous forum messages about locked messages, restarted all consumers and checked the XA Resource Manager).
Restarting the router is not an activity we take lightly but this does remove the locked messages. However, I am keen to find out exactly why they are locked (without the restart if possible).
The HA Router comes with the JMS XA/ASF Swiftlet - does this mean that we will be using XA by default, or does it have any other impact on locked messages? If so, would we not expect to see locked messages in the XA Resource Manager?
Only one queue currently appears to suffer from this issue, and the consumer is an Apache Camel application using the following Spring configuration:
The JMS XA/ASF Swiftlet merely provides the XA feature; it is only used if your app actually uses it.
Concerning locked messages:
When a consumer is created (e.g. a JMS MessageConsumer) and either receive() is called for the first time or a MessageListener is set, a request is sent to the router to fill this consumer's client-side cache. Further receives or listener invocations are then served out of this client-side cache. The default cache size is 500 messages or 2 MB. If the cache is nearly empty, it is refilled in the background.
All messages in the client-side cache are covered by an internal read transaction which locks the messages until they are either consumed or the transaction is rolled back. The latter unlocks the messages so other consumers can receive them.
If MessageConsumer.close() is called, the read transaction which holds the locks is rolled back, which unlocks all messages.
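To make the locking semantics above concrete, here is a small toy model in plain Java (this is not SwiftMQ code; the class and method names are invented for illustration). Creating a consumer prefetches messages into a locked client-side cache, receive() is served from that cache, and close() rolls back the read transaction, returning unconsumed messages to the queue:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of the described behaviour: a consumer prefetches messages
// into a client-side cache under a read transaction; close() rolls that
// transaction back and unlocks whatever was not consumed.
class ModelQueue {
    private final Deque<String> available = new ArrayDeque<>();

    void publish(String msg) { available.addLast(msg); }
    int availableCount() { return available.size(); }

    class Consumer {
        private final Deque<String> cache = new ArrayDeque<>(); // locked messages
        private final List<String> consumed = new ArrayList<>();

        Consumer(int cacheSize) {
            // Fill the client-side cache; these messages are now locked
            // and invisible to other consumers.
            while (cache.size() < cacheSize && !available.isEmpty())
                cache.addLast(available.pollFirst());
        }

        String receive() {
            String m = cache.pollFirst(); // served from the cache, not the router
            if (m != null) consumed.add(m);
            return m;
        }

        void close() {
            // Roll back the read transaction: unlock unconsumed messages.
            while (!cache.isEmpty()) available.addFirst(cache.pollLast());
        }
    }
}

public class LockingDemo {
    public static void main(String[] args) {
        ModelQueue q = new ModelQueue();
        for (int i = 1; i <= 5; i++) q.publish("m" + i);

        ModelQueue.Consumer c = q.new Consumer(3); // locks m1..m3
        System.out.println(q.availableCount());    // prints 2
        c.receive();                               // consumes m1
        c.close();                                 // unlocks m2, m3
        System.out.println(q.availableCount());    // prints 4
    }
}
```

Note how a consumer that is created but never closed keeps its cached messages locked indefinitely; this is exactly the symptom described in this thread.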
Concerning our spring support library:
It adds a layer for pooling sessions, producers and consumers from a single shared connection. When a consumer is created from a pooled session, the library either returns a free consumer for the same queue name from its internal pool, or creates a new pooled consumer. The app uses it just like a normal MessageConsumer. The difference is in the close() method, which just returns the consumer to the pool.
There is a cleanup thread in the library which checks all pools (session, producer, consumer) for expiration, which occurs after 60 seconds of non-use. An expired consumer is deleted from the pool and its regular close() method is called, which leads to rollback of the read transaction and unlocks all messages of that particular consumer.
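The pool expiration logic described above can be sketched roughly as follows (a toy model, not the actual library; all names are invented). The pooled wrapper's close() merely checks the consumer back in with a timestamp, and a cleanup pass calls the real close() only on entries idle longer than the expiration interval:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy sketch of the pooling/expiration scheme: only the cleanup thread's
// call to the real close() finally rolls back the read transaction.
class ConsumerPool {
    static final long EXPIRATION_MS = 60_000; // 60 secs of non-use

    static class PooledConsumer {
        final String queueName;
        long lastUsed;
        boolean closed;
        PooledConsumer(String queueName) { this.queueName = queueName; }
        void close() { closed = true; } // would roll back the read transaction
    }

    private final Map<String, PooledConsumer> pool = new HashMap<>();

    // The pooled wrapper's close() just checks the consumer back in.
    void checkIn(PooledConsumer c, long now) {
        c.lastUsed = now;
        pool.put(c.queueName, c);
    }

    // Reuse a free consumer for the same queue name, or create a new one.
    PooledConsumer checkOut(String queueName) {
        PooledConsumer c = pool.remove(queueName);
        return c != null ? c : new PooledConsumer(queueName);
    }

    // Periodically run by the library's cleanup thread.
    void cleanup(long now) {
        Iterator<PooledConsumer> it = pool.values().iterator();
        while (it.hasNext()) {
            PooledConsumer c = it.next();
            if (now - c.lastUsed >= EXPIRATION_MS) {
                it.remove();
                c.close(); // unlocks this consumer's cached messages
            }
        }
    }
}
```

This also illustrates the failure mode suggested below: a consumer that is constantly checked out and back in keeps refreshing lastUsed and never expires, so its read transaction is never rolled back.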
I assume that Camel or Spring uses a scheme where it creates a consumer, reads one message, and closes it - something like that. If multiple threads are used, this may lead to multiple consumers for a single queue being checked in and out of the pool without ever reaching the expiration time. However, if multiple threads are being used, a dedicated JMS session must be used for each thread, and thus only a single consumer can be active per dedicated queue.
So actually I'm not sure what's going on there. Under standard JMS spec usage (one thread, dedicated session), you will always use the same consumer for a queue/topic, and the messages must be unlocked 60 secs (the expiration time) after the consumer's close() method has been called. So either close() isn't being called, or multiple threads are using the same session, which is forbidden.
Hi - thanks for the detailed response - I am attempting to reproduce this issue so that I can prove it one way or the other. The Camel documentation states that the default configuration is to use a single consumer.
Putting the root cause aside for a moment - is there any way that a feature could be added to the message router to allow us to manually either commit or rollback the locked messages so that they can be either cleared or made available for consumption again?
One way to check whether there is a consumer on this queue is via Explorer. You need to disable "smart tree" under Router Environment. Unfortunately this requires a restart. Thereafter you can drill down to Queue Manager / Usage and see each consumer on this queue.
and the ConnectionFactory has the default settings.
We do not see the log line that should be observed immediately upon receipt of the message. I wondered if the 10,000-second timeout on the SMQP URL may be causing an issue, but have not been able to replicate the issue in the test environment as yet.
Sure - I've attached a tarball containing a Gradle project that just needs the SwiftMQ libs in the lib folder and the queue name and JNDI settings configured in config.properties. It should consume messages from the queue and create files in the data/queued folder.