We are currently facing a message-locking issue in our SwiftMQ router.
On the consumer side, some messages seem not to reach our code, and on the router side (as seen through the Explorer console) they remain locked until the router is rebooted.
Our implementation is based on Spring (message-driven POJOs) and uses:
- the org.springframework.jms.connection.JmsTransactionManager transaction manager
- the ConnectionFactory implementation provided by SwiftMQ
- sessionTransacted = true
- no XA
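For reference, a minimal sketch of this kind of setup (bean names, the JNDI name, the destination name, and the listener bean are illustrative placeholders, not our actual configuration):

```xml
<!-- Illustrative sketch only; names are placeholders -->
<bean id="connectionFactory" class="org.springframework.jndi.JndiObjectFactoryBean">
  <!-- SwiftMQ-provided ConnectionFactory looked up via JNDI -->
  <property name="jndiName" value="ConnectionFactory"/>
</bean>

<bean id="transactionManager"
      class="org.springframework.jms.connection.JmsTransactionManager">
  <property name="connectionFactory" ref="connectionFactory"/>
</bean>

<bean id="listenerContainer"
      class="org.springframework.jms.listener.DefaultMessageListenerContainer">
  <property name="connectionFactory" ref="connectionFactory"/>
  <property name="transactionManager" ref="transactionManager"/>
  <property name="sessionTransacted" value="true"/>
  <property name="destinationName" value="testqueue"/>
  <property name="messageListener" ref="messageDrivenPojo"/>
</bean>
```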
When the problem occurs, we could see neither a pending XA transaction on the router nor any open connection in the network swiftlet.
We have created a test case and can reproduce the problem (10-20% of the time) under the following conditions:
- 1 sender of 200 messages (1 msg / 5 ms)
- 1 receiver (1 msg / 10 ms)
During our investigations, we enabled the SwiftMQ message-tracking logs and recorded them during both a passing and a blocking scenario. Comparing these logs, we noticed a difference in the number of calls to the com.swiftmq.jms.v750.MessageConsumerImpl.addToCache method.
We can provide you with the recorded logs of the two scenarios:
- 200 sent messages -> 200 consumed messages (18,102 calls to addToCache)
- 200 sent messages -> 94 consumed messages and 106 locked messages (3,163 calls to addToCache)
In our analysis of the blocking scenario we noticed that:
1- the first messages, #143 to #150, are successfully received
2- messages #151 to #216 remain blocked, and the addToCache method is not correctly called on the session that handles them
3- messages #218 to #226 are successfully received
4- messages #228 to #281 also remain blocked
5- the last messages, #283 to #368, are processed normally and successfully received.
We have also recorded thread dumps and heap dumps of both scenarios.
Can you help us investigate further to understand what is happening?
Before going into details, can you please use our Spring Support lib? The description is here. This lib ensures that all JMS calls (especially concurrent accesses) are done properly. Please let me know if the problem remains.
Thanks for your answer.
We have integrated the Spring Support library as described on your site. We no longer see problems regarding locked messages.
However, the documentation you pointed to (JMS Pooling for Spring's JmsTemplate) describes an example with a single SingleSharedConnectionFactory, where you write:
"SingleSharedConnectionFactory uses a single shared JMS connection.
This single connection provides pooling for JMS sessions where each JMS session pools producers and consumers."
In our infrastructure, we use hundreds of producers and consumers distributed across many webapps (more than 50). Most consumers are in fact topic subscriptions with a defined clientID.
From our perspective, it seems we must define one SingleSharedConnectionFactory per consumer.
After a long period during which we had not encountered any problems with our JMS activity, we had two incidents involving JMS consumption. In each case, messages were available on the router but the consumer was not consuming them.
Restarting the application containing the consumer cleared the deadlock.
We have since managed to reproduce the blocking phenomenon in a test environment and made the following observations:
1) The subscription to the topic is supplied with 10,000 messages
2) The consumer starts to process messages, then stops unexpectedly
3) The application traces indicate that 984 messages were processed
4) On the router side, in sys$jms/usage, the corresponding session indicates a total-received of 1000 (twice the smqp-consumer-cache-size value)
5) On the router side, in sys$queuemanager/usage, a view command on the subscription shows that the first 15 messages are locked
Beyond the workaround, we would like to analyze the blocked messages further.
Could you tell us in which direction to continue our investigations?
We use the latest version of the SwiftMQ router (9.7.3), and the client application embeds the SwiftMQ client library of the same version.
This is certainly an application problem and you need to check/debug your app.
Messages are locked when they are delivered to the client-side cache. The app receives from this cache, which unlocks them. So either it doesn't receive at all, or you have transacted sessions and do not perform a commit after receive.
Another possibility is that you are using a session/receiver from multiple threads (concurrent consumption). This is not allowed by JMS and can lead to locks that are never released.
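The locking semantics described above can be illustrated with a small toy model (this is NOT SwiftMQ code; the class and method names are invented for illustration): a transacted session locks each message on receive, and only a commit releases the locks, while a rollback puts the messages back on the queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of the locking semantics described above -- NOT SwiftMQ code.
// A transacted "session" locks each message on receive(); the lock is only
// released by commit() (messages removed) or rollback() (messages redelivered).
class ToySession {
    private final Deque<String> queue = new ArrayDeque<>();
    private final List<String> locked = new ArrayList<>();

    void send(String msg) { queue.addLast(msg); }

    String receive() {
        String msg = queue.pollFirst();
        if (msg != null) locked.add(msg);   // delivered but locked until commit
        return msg;
    }

    void commit()   { locked.clear(); }     // acknowledge -> unlock and remove
    void rollback() {                       // unlock -> back on the queue
        for (int i = locked.size() - 1; i >= 0; i--) queue.addFirst(locked.get(i));
        locked.clear();
    }

    int lockedCount() { return locked.size(); }
}

public class LockingDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.send("m1");
        session.send("m2");

        session.receive();                          // m1 is now locked
        session.receive();                          // m2 is now locked
        System.out.println(session.lockedCount());  // 2: received but not committed

        session.commit();                           // commit after receive releases the locks
        System.out.println(session.lockedCount());  // 0
    }
}
```

If the app never calls commit() in this model, lockedCount() never drops back to zero, which mirrors the "locked until restart" symptom: a missing commit after receive leaves the messages locked on the router.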
The problem occurs because of concurrent access to the session state in com.swiftmq.jms.v750.SessionImpl$SessionDeliveryQueue.isStarted(), between the thread that feeds the cache via com.swiftmq.jms.v750.SessionImpl$SessionDeliveryQueue.process() and the consumer thread that calls com.swiftmq.jms.v750.ConnectionImpl.stop().
The application ends up in a situation where, while the cache is being filled, session.isStarted() returns false and the feed process exits mid-fill.
Consumption then stops once the local cache becomes empty.
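The race we suspect can be sketched as a deterministic, self-contained simulation (simplified and hypothetical names; this is not the actual SwiftMQ code): a delivery thread fills the cache while checking a started flag, a concurrent stop() flips the flag mid-fill, the fill loop bails out, and the consumer starves once the partial cache drains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Deterministic simulation of the suspected race (simplified stand-ins for
// SessionImpl$SessionDeliveryQueue and ConnectionImpl -- not SwiftMQ code).
public class DeliveryRaceDemo {
    static volatile boolean started = true;                // session "started" state
    static final List<Integer> cache = new ArrayList<>();  // consumer-side cache

    public static void main(String[] args) throws InterruptedException {
        started = true;                                    // reset so the demo is repeatable
        cache.clear();

        final int batch = 10;
        final CountDownLatch midFill = new CountDownLatch(1);
        final CountDownLatch stopped = new CountDownLatch(1);

        // Delivery thread: fills the cache, but bails out if the session is
        // stopped mid-fill (mirrors process() checking isStarted() while feeding).
        Thread delivery = new Thread(() -> {
            for (int i = 0; i < batch; i++) {
                if (!started) return;                      // exits mid-fill -> cache stays short
                cache.add(i);
                if (i == 4) {
                    midFill.countDown();                   // let the other thread stop() now
                    try { stopped.await(); } catch (InterruptedException e) { return; }
                }
            }
        });

        delivery.start();
        midFill.await();
        started = false;                                   // consumer thread calls Connection.stop()
        stopped.countDown();
        delivery.join();

        // Only 5 of 10 messages were delivered; once the consumer drains these,
        // consumption stalls, just like the blocked messages we observed.
        System.out.println("cache size = " + cache.size());
    }
}
```

Here the interleaving is forced with latches so the early exit happens every run; in the real application the same interleaving occurs only occasionally, which would explain why the problem is intermittent.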
The application in which the problem occurred is an old application that uses Spring v2.0.0.
As of version 2.0.3 of the Spring Framework, this code changed: there are no longer calls to start and stop on the connection between receives.
We are going to test upgrading the Spring version; if our analysis is correct, we should no longer encounter the blocking problem.