Behavior with expired message scheduling

life
We are testing the Scheduler Swiftlet to delay messages for retry purposes, but we found that the processing rate is quite slow, and consequently some messages are lost (removed by the Scheduler Swiftlet because their schedule expired).

As read from the following 2 threads:

Sluggish response from sys$scheduler

Message Schedules gets deleted

Some configuration tuning can improve the performance, but message loss is still possible if the processing rate (based on JMS message selectors) cannot keep up with the incoming rate or the total number of messages to be processed. This makes me wonder about the behavior when a message schedule expires.

From the user/application point of view, a message is expected to be delayed for some time and then released to the actual output queue. So if the message schedule has expired (due to a wrong calculation or a slow processing rate), shouldn't the message be sent to the actual output queue immediately, instead of being removed (which counts as message loss from the user/application point of view)?

FYI, we do not set the property "JMS_SWIFTMQ_SCHEDULER_EXPIRATION", which I think should be the only reason for a message to be removed from the queue (if the value is set).

Anyway, right now we are using 9.4.2 for dev testing and 9.6/9.7 for production. Is there any way not to delete messages whose schedule has expired? Or is a code change needed to introduce a new property that indicates whether a message with an expired schedule should be removed?

Please advise. Thanks a lot!

Re: Behavior with expired message scheduling

IIT Software
Administrator
One of the threads you mentioned gives some important hints about improving the scheduler's performance. They were able to handle 5K schedules in the end.

JMS_SWIFTMQ_SCHEDULER_DATE_TO is the date on which the schedule expires and is administratively removed (you see it in the log). You can set it to "forever".

JMS_SWIFTMQ_SCHEDULER_EXPIRATION is the final message expiration set on the message when it is enqueued in the final destination.

Re: Behavior with expired message scheduling

life
Thanks for the prompt reply!

I just tried it: setting "JMS_SWIFTMQ_SCHEDULER_DATE_TO" to "forever" moves the schedule to the next day (since the scheduled time is already older than the current time). See the following extract (an expired schedule time was set on purpose for the simulation):

1) JMS properties set by application when sending message to queue "swiftmqscheduler" (log time as 2017-04-03 08:57:23,924)

-- JMS_SWIFTMQ_SCHEDULER_COMMAND=add
-- JMS_SWIFTMQ_SCHEDULER_DATE_FROM=2017-04-03
-- JMS_SWIFTMQ_SCHEDULER_DATE_TO=forever
-- JMS_SWIFTMQ_SCHEDULER_DESTINATION=TEST_JMS2JMS_SIMPLE_OUT
-- JMS_SWIFTMQ_SCHEDULER_DESTINATION_TYPE=queue
-- JMS_SWIFTMQ_SCHEDULER_ENABLE_LOGGING=true
-- JMS_SWIFTMQ_SCHEDULER_MAY_EXPIRE_WHILE_ROUTER_DOWN=false
-- JMS_SWIFTMQ_SCHEDULER_TIME_EXPRESSION=at 08:57:13


2) message dump via CLI view (queue "sys$scheduler")

<?xml version="1.0" encoding="UTF-8"?>
<result>
  <message index="4" message-key="8014" locked="false" size="876" type="TextMessage">
    <jms-header>
      <JMSDeliveryMode>PERSISTENT</JMSDeliveryMode>
      ... ...
      <JMSTimestamp>03 Apr 2017 08:57:23.924 +0000 (1491209843924)</JMSTimestamp>
    </jms-header>

    <jms-vendor-properties>
      <property name="JMS_SWIFTMQ_SCHEDULER_COMMAND" type="java.lang.String" value="add"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_DATE_FROM" type="java.lang.String" value="2017-04-03"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_DATE_TO" type="java.lang.String" value="forever"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_DESTINATION" type="java.lang.String" value="TEST_JMS2JMS_SIMPLE_OUT"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_DESTINATION_TYPE" type="java.lang.String" value="queue"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_ENABLE_LOGGING" type="java.lang.Boolean" value="true"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_ID" type="java.lang.String" value="1491209843925-8018"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_MAY_EXPIRE_WHILE_ROUTER_DOWN" type="java.lang.Boolean" value="false"/>
      <property name="JMS_SWIFTMQ_SCHEDULER_TIME_EXPRESSION" type="java.lang.String" value="at 08:57:13"/>
    </jms-vendor-properties>



3) CLI output with "lc sys$scheduler/usage/active-message-schedules" for the above ID of "1491209843925-8018"

Properties for this Entity:

Name                                    Current Value
--------------------------------------------------------------
next-start (R/O)                        2017-04-04 08:57:13
next-stop (R/O)                         <not set>
schedule-calendar (R/O)                 <not set>
schedule-date-from (R/O)                2017-04-03
schedule-date-to (R/O)                  forever
schedule-logging-enabled (R/O)          true
schedule-time-expression (R/O)          at 08:57:13
state (R/O)                             SCHEDULED

Sub-Entities of this Entity:
----------------------------
history



Our requirement is to release such a message to the output queue immediately (since the schedule has expired, i.e. the message has already been delayed), so that the application can re-process the message (as part of retry with delay). Is that possible?



Re: Behavior with expired message scheduling

IIT Software
Administrator
Yeah, sorry. With "forever" it repeats every day so this will only work if you programmatically remove the schedule.

We can add "fire when expired" in the next release.

But what is the problem you see? Will it expire before it fires? Or is it just that you must have a delivery guarantee?

Re: Behavior with expired message scheduling

life
Basically we are trying to use the Scheduler Swiftlet to implement retry with delay, which looks like an elegant yet simple queue-based solution compared with developing other applications with or without a database:

- messages that need to be retried shall be delayed and then sent to the output queue (the application sends those messages to the scheduler input queue)

- such messages should never be removed by the scheduler, even if the schedule has expired (e.g. due to slow processing or queuing in the scheduler); a slightly longer delay is fine

- the application will handle the output queue and decide when to delete the message


Based on our testing and findings from this forum, the scheduler system threads process messages in the input queue "swiftmqscheduler":

- if there is no error, a scheduling job is created and the message is moved to the internal queue "sys$scheduler", to be handled by the scheduler job threads later

- if an error or an expired schedule is found, the message is removed, which means message loss for the application


Based on our use case, it would be good to support the following behavior (e.g. via a new property) for the scheduler system threads:

1) if expired scheduling is allowed, such a message should be moved to the output queue immediately (new behavior)

2) if expired scheduling is not allowed, such a message should be removed immediately (existing behavior, which could remain the default for backward compatibility)
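
A minimal sketch of the two requested behaviors, written as a plain decision function. This is only an illustration of the proposal, not SwiftMQ code: the flag `fire_on_expiry` stands in for the requested new property, which does not exist in the versions discussed here, and the returned action names are made up for the example.

```python
def handle_message_schedule(next_start, fire_on_expiry=False):
    """Decide what to do with a message schedule.

    next_start:     the computed next start time, or None if the schedule
                    has already expired (no next start exists).
    fire_on_expiry: hypothetical flag for the requested new property.
    """
    if next_start is not None:
        # Normal case: a job is created and fires at next_start.
        return "schedule"
    if fire_on_expiry:
        # Requested new behavior 1): deliver to the output queue right away.
        return "deliver_now"
    # Existing behavior 2): the message schedule is removed.
    return "remove"
```

With the default `fire_on_expiry=False` the function reproduces the existing behavior, which is the backward-compatible default suggested above.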

Re: Behavior with expired message scheduling

IIT Software
Administrator
life wrote
1) if expired scheduling is allowed, such a message should be moved to the output queue immediately (new behavior)
We can add this via a new property in the message schedule for the next release. But this may take a while as we are working on the next major release. When do you plan to use this?

Re: Behavior with expired message scheduling

life
We are testing now whether the Scheduler Swiftlet can support application retry with delay, so we would need it as soon as possible. However, upgrading production to the latest or a patched version (I think we do have a support license) might be an issue, since this is not a critical issue for production operation but rather an application feature.

If the existing versions work with some acceptable limitations, that would be fine and we can live with it while waiting for the new release; otherwise we would have to try another solution.

I am just trying to understand where messages are removed (please correct me if my understanding is wrong) in the following two processing steps of the Scheduler Swiftlet:

Step 1) the scheduler system threads process messages in the input queue "swiftmqscheduler"; messages without errors are moved to the internal queue "sys$scheduler" and a scheduler job is created, while messages with errors (including an invalid or expired message schedule) are removed

Step 2) scheduler job threads are created later to process those scheduler jobs from the internal queue "sys$scheduler"


Initially I thought messages would only be removed in Step 1, so I disabled flow control on the internal queue "sys$scheduler" so that all messages can be moved to the internal queue as fast as possible:

- this works fine with batches of 1000 messages (3 batches, each with a delay of 1 minute, and a batch delay of 1 minute) ==> the output queue has 3000 messages

- this does not work with batches of 2000 messages (3 batches, each with a delay of 1 minute, and a batch delay of 1 minute) ==> the output queue has only 2000+ messages, while the other 3000+ messages are removed in Step 2


Any idea why messages are removed in Step 2? Or is this expected, and a software enhancement is needed (e.g. a new property for expired message schedules) so that messages with expired schedules are not removed in either Step 1 or Step 2?

If yes, is a patch release possible for our testing (based on version 9.4.2), and how long would it take? Thanks!

Re: Behavior with expired message scheduling

IIT Software
Administrator
So you send 3 batches of 1000 messages each to queue "swiftmqscheduler", each message containing a schedule with a 1 minute delay?

When such a message schedule arrives, the scheduler will determine the next start time of this message job. If the next start time is < current time, the message will expire. There is quite a bit of work to do for each schedule, so it might be that your delay is too short. Can you test it with a greater delay, say 3 minutes? Does it make a difference?
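
The next-start rule can be sketched as follows. This is a simplified model under stated assumptions, not the scheduler's actual code: it only handles a daily "at HH:MM:SS" time expression, and `next_start` and `FOREVER` are names made up for the example. It reproduces the two outcomes seen earlier in this thread: with DATE_TO "forever" the schedule moves to the next day, and with DATE_TO on the same day there is no next start, so the message schedule expires.

```python
from datetime import date, datetime, time, timedelta

FOREVER = None  # stands in for JMS_SWIFTMQ_SCHEDULER_DATE_TO=forever


def next_start(now, date_from, date_to, at):
    """Next start of a daily 'at HH:MM:SS' schedule, or None if it has expired."""
    # The first candidate is today's (or the first valid day's) fire time.
    day = max(date_from, now.date())
    candidate = datetime.combine(day, at)
    if candidate < now:
        # The fire time has already passed today: try the following day.
        candidate += timedelta(days=1)
    if date_to is not FOREVER and candidate.date() > date_to:
        # No start time left inside the date range: the schedule is expired.
        return None
    return candidate


# Values from the earlier extract: sent at 08:57:23 with "at 08:57:13".
now = datetime(2017, 4, 3, 8, 57, 23)
print(next_start(now, date(2017, 4, 3), FOREVER, time(8, 57, 13)))
print(next_start(now, date(2017, 4, 3), date(2017, 4, 3), time(8, 57, 13)))
```

The first call yields 2017-04-04 08:57:13, matching the next-start value shown by the CLI; the second yields None, which corresponds to the removed message.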

Re: Behavior with expired message scheduling

life
Regarding "the scheduler will determine the next start time of this message job": I assume this is done when the message is picked up from the input queue, right?

Actually the problem is not the schedule delay time but rather the load, e.g. when the input rate is faster than the processing rate or when messages queue up in the internal queue:

- the actual application would need to support smaller delays, e.g. 30 seconds

- there is no issue sending 1 batch of 5000 messages with a schedule delay of 1 minute: all messages are delivered to the output queue, although the last message is delayed by a few minutes (due to the scheduler's slow processing rate)

- the problem arises with multiple batches, which simulate application retry messages with delay: if the 2nd batch arrives before all messages from the 1st batch have been processed, some messages might be removed


With flow control disabled on the internal queue, I expected that all messages would be processed quickly (otherwise, with queuing in the internal queue plus a slow processing rate, some messages would have an expired schedule time when picked up by the scheduler from the input queue), then moved to the internal queue with the corresponding job created, and that the messages in the internal queue could then be processed slowly and moved to the actual output queue.

But based on my test with 3 batches of 2000 messages, it looks like the scheduler somehow stops creating scheduler jobs although all messages were moved to the internal queue, and later the messages without a scheduler job are removed. Any idea why?

Re: Behavior with expired message scheduling

IIT Software
Administrator
Well, I checked the code. I assume you've exhausted the capacity of the scheduler. For example, there is an entry created in the management tree (and propagated to all running Explorers) for each message schedule. Look here, end of page:

http://www.swiftmq.com/products/router/swiftlets/sys_scheduler/messagejobs/index.html

For each message schedule there is also a timer created (consuming from timer.tasks thread pool).

What about a different approach? For example, send your delayed messages to another queue (call it "delayqueue") and use the Queue Mover Job to send them to the original queue:

http://www.swiftmq.com/products/router/swiftlets/sys_queuemanager/jobs/index.html

Schedule the job to run in intervals of 30 secs, set a property on each message with the target queue name and use a message selector on that to select messages for the target queue. Schedule one Queue Mover Job for each different target queue but the same source queue "delayqueue".
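
The routing idea can be modeled in a few lines. This is a toy simulation of one scheduled run, not Queue Mover Job configuration: the property name `targetQueue`, the message layout and the function name are assumptions made for the example. Each mover job would use a JMS selector of the form targetQueue = '<queue name>' on the shared source queue.

```python
from collections import defaultdict


def run_mover_jobs(delay_queue, target_queues):
    """Simulate one run of one mover job per target queue.

    Returns (delivered, remaining): messages grouped by target queue,
    and the messages no job selected, which stay in the delay queue.
    """
    delivered = defaultdict(list)
    remaining = []
    for message in delay_queue:
        # Hypothetical property carrying the name of the target queue.
        target = message["properties"].get("targetQueue")
        if target in target_queues:
            # Selected by the job whose selector is: targetQueue = '<target>'
            delivered[target].append(message)
        else:
            remaining.append(message)
    return dict(delivered), remaining
```

Note that every selected message is moved at the next run, so the delay is bounded by the job interval rather than chosen per message.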

Should work IMO.

Re: Behavior with expired message scheduling

life
Thanks for the explanation and suggestion!

Yeah, the router info log shows 1-2 minutes during which no message jobs were started, which would cause the timers to expire, and consequently those messages were removed (although an alternative handling would be to release those messages to the actual output queue, since from the application point of view they have already been delayed and it is up to the application to decide what to do with them).

Regarding the Queue Mover Job suggestion: it would be useful when all messages have a fixed delay, but NOT when a dynamic delay is needed per message. We have actually been trying a similar solution managed by the application, but it is difficult to support messages with different delay intervals.

BTW, we noticed that the scheduler processing rate might be too slow under load:

1) testing 1 batch of 1000 messages with a delay of 1 minute ==> OK
- all messages are scheduled and sent to the actual output queue
- the overall throughput is about 100 msg/s, so it took about 10 seconds to process all messages
- no messages lost, while some messages are actually delayed for 70 seconds

2) testing 1 batch of 2000 messages with a delay of 1 minute ==> OK
- all messages are scheduled and sent to the actual output queue
- throughput is about 50 msg/s, so it took about 40 seconds to process all messages
- no messages lost, while some messages are actually delayed for 90+ seconds

3) testing 1 batch of 5000 messages with a delay of 1 minute ==> NOK
- all messages are scheduled and sent to the actual output queue
- throughput is about 20 msg/s, so it took about 4 minutes to process all messages
- no messages lost, but some messages are actually delayed for 4+ minutes

4) testing 1 batch of 10000 messages with a delay of 1 minute ==> NOK
- all messages are scheduled and sent to the actual output queue
- throughput is about 10 msg/s, so it took about 16 minutes to delay and process all messages
- no messages lost, but some messages are actually delayed for 10+ minutes
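
The processing times in tests 1) to 4) follow directly from batch size divided by throughput, which the short calculation below reproduces from the reported figures. It also shows that batch size times throughput is the same in all four tests (100,000), so the processing time grows roughly with the square of the batch size. That would be consistent with a per-message cost that grows with the number of pending schedules, although these numbers alone do not establish the cause.

```python
# (batch size, observed throughput in msg/s) as reported in tests 1) to 4)
tests = [(1000, 100), (2000, 50), (5000, 20), (10000, 10)]


def processing_seconds(batch_size, msgs_per_second):
    """Time needed to process one batch at a constant throughput."""
    return batch_size / msgs_per_second


for size, rate in tests:
    secs = processing_seconds(size, rate)
    print(f"{size:>6} msgs at {rate:>3} msg/s -> {secs:6.0f} s "
          f"({secs / 60:.1f} min), size x rate = {size * rate}")
```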

This might be a limitation of using a queue plus message selectors. The new property solution (sending messages with expired schedules to the output queue immediately instead of removing them) would help to avoid message loss, but might not help much with the overall throughput. What do you think?

If the processing rate under load could not be improved, we might need to consider other solutions.

Re: Behavior with expired message scheduling

IIT Software
Administrator
Can you set max-threads="-1" (unlimited) for thread pool "timer.tasks" and repeat your test? Please watch the pool to see how many running threads you have.

Re: Behavior with expired message scheduling

IIT Software
Administrator
In reply to this post by life
The message job scheduling has its limits. I don't think it is the selector but the timer.tasks pool which has max-threads="3" by default. These tasks execute ALL timers in the router.

Also, all message jobs are registered in the Usage section of the Scheduler Swiftlet to enable admins to remove them. Thousands of entries are certainly not usable.

Anyway, we will provide an alternative way to schedule message jobs by providing a SwiftMQ Streams admin stream (part of an admin stream package) to process these schedules. This is a much more elegant way and it scales!

Re: Behavior with expired message scheduling

life
In reply to this post by IIT Software
OK, I set max-threads to 100 (setting "-1" gives the error "max-threads must be greater or equal to min-threads", since both max-threads and min-threads are 3 by default) and repeated the following test:

3)  testing 1 batch of 5000 messages with a delay of 1 minute

The result is similar to my previous test ==> throughput is still about 20 msg/s, so it took about 4 minutes to process all messages


Pool usage from CLI (lc sys$threadpool/usage/timer.tasks):

Entity:      Thread Pool
Description: Active Threadpool

Properties for this Entity:

Name                                    Current Value
--------------------------------------------------------------
idling-threads (R/O)                    38
running-threads (R/O)                   1

Entity contains no Sub-Entities.


Running the same CLI command a few times, running-threads is sometimes 2.

Re: Behavior with expired message scheduling

IIT Software
Administrator
5000 schedules with 1m delay would fire all 5000 schedules at the same time. Is that realistic? I think you will have better throughput if you spread your delays which seems more realistic, e.g. every 100 schedules with the same delay.

Keep in mind that your app doesn't need to redeliver everything that comes in if some resource is unavailable. You could schedule redelivery for the current transaction and call connection.stop() so all receivers will stop delivery until you call connection.start(), timer or event driven, when the resource comes up again.

I admit the message scheduler doesn't scale if you pound it this way. Since it wasn't the timer.tasks pool, it is certainly the number of selectors (5000) for this single queue. Messages aren't lost so that means the schedule is not expired but the jobs were activated.

I cannot patch it in any way. Other solutions like a new Queue Manager Job or a JMS app deployed in a JAC container will all end up using selectors. I assume, however, that you will get much better throughput if you spread the delays and/or use connection.start/stop. So this will at least be a working solution. As already mentioned, we will provide a SwiftMQ Stream in the next release that does this message schedule job far better and scales. But you need to upgrade then.

Re: Behavior with expired message scheduling

life
Yes, in practice we might not have 5000 messages to be scheduled at the same time, but it is possible to have a lot of messages to be scheduled/processed. These various simulation tests are just load tests to understand how the Scheduler Swiftlet works and what issues are possible, as my colleague is actually working/testing with a real application and did encounter similar issues (mainly message loss).

Yes, actual production usage could load balance the delayed messages across multiple routers, which would reduce the load per router.

Nevertheless I do see one possible improvement regarding the possible new property for expired message schedules (or anything else wrong with the message), especially from the application point of view:

- whatever error/exception happens in the Scheduler Swiftlet, the scheduler should not delete the message; rather it should send the message to the specified output queue (unless the format is invalid or the output queue does not exist) and let the application decide what to do with that message (delay again or delete)

- this could be the default behavior, or an alternative behavior based on application request (new property)


Thanks for the pointer about SwiftMQ Stream! But as our production is still on version 9.6/9.7, it might not be feasible at this moment to implement application logic based on that.

Re: Behavior with expired message scheduling

IIT Software
Administrator
It is not as easy as adding another property and inserting an if-statement. A message schedule is received from the input queue of the scheduler and then converted into a "real" schedule, just as if you created a job schedule via Explorer. This schedule then goes through some stages where at the end the next job start is calculated from your date from/to and time expression. If there is no next job start, the schedule will be removed if it is a message schedule.

Wouldn't it be possible to at least set the delay so high that the job is executed without expiring?


Re: Behavior with expired message scheduling

IIT Software
Administrator
In reply to this post by life
We strive to make the message scheduler stream compatible with the properties of the message schedules. So all you would need to change once you upgrade is the queue name.

Re: Behavior with expired message scheduling

life
In reply to this post by IIT Software
Understood; sometimes user requirements conflict with the software design, and it might be difficult to change the software.

Anyway, from my point of view, since every message is eventually deleted or moved, either when it is processed in the input queue (due to an error or an expired schedule) or in the internal queue (due to a scheduling error or a completed schedule), it should be possible to decide what to do with that message, i.e. either delete it immediately or move it to the specified output queue (or copy, then delete).

We are looking for a general solution for applications with delay requirements that can handle different delay intervals (e.g. 30 seconds, 1 minute, 5 minutes, etc.). It is OK if the actual delay is longer than scheduled/expected, but message loss should be avoided in all cases.
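
As a sketch of how an application could derive per-message schedule properties for different delay intervals, the helper below builds the same property set shown in the earlier extract from a send time and a delay. The function name is made up for the example, and it assumes that JMS_SWIFTMQ_SCHEDULER_DATE_TO accepts the same yyyy-MM-dd format as JMS_SWIFTMQ_SCHEDULER_DATE_FROM; check the Scheduler Swiftlet documentation before relying on that.

```python
from datetime import datetime, timedelta


def delayed_schedule_properties(now, delay, destination):
    """Build scheduler properties for one delivery `delay` after `now`."""
    fire_at = now + delay
    # Use the fire date, not the send date, so a delay that crosses
    # midnight still lands on the intended day.
    fire_date = fire_at.strftime("%Y-%m-%d")
    return {
        "JMS_SWIFTMQ_SCHEDULER_COMMAND": "add",
        "JMS_SWIFTMQ_SCHEDULER_DATE_FROM": fire_date,
        # Assumption: a single-day range, so the schedule does not repeat daily.
        "JMS_SWIFTMQ_SCHEDULER_DATE_TO": fire_date,
        "JMS_SWIFTMQ_SCHEDULER_DESTINATION": destination,
        "JMS_SWIFTMQ_SCHEDULER_DESTINATION_TYPE": "queue",
        "JMS_SWIFTMQ_SCHEDULER_ENABLE_LOGGING": True,
        "JMS_SWIFTMQ_SCHEDULER_MAY_EXPIRE_WHILE_ROUTER_DOWN": False,
        "JMS_SWIFTMQ_SCHEDULER_TIME_EXPRESSION": fire_at.strftime("at %H:%M:%S"),
    }


props = delayed_schedule_properties(
    datetime(2017, 4, 3, 8, 57, 23), timedelta(seconds=30), "TEST_JMS2JMS_SIMPLE_OUT"
)
print(props["JMS_SWIFTMQ_SCHEDULER_TIME_EXPRESSION"])
```

With a single-day date range, a schedule that is processed after its fire time has no next start and is removed, which is exactly the loss case discussed in this thread; "forever" avoids removal but repeats the delivery every day.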

Re: Behavior with expired message scheduling

life
In reply to this post by IIT Software
Ok, maybe we can look into it in the future.

BTW, while browsing through the Streams Swiftlet documentation, I did not see any message schedule example for a JMS client similar to the one for the Scheduler Swiftlet. Is it still in progress, or is the usage with the Streams Swiftlet different?