What is the algorithmic time complexity of applying JMS selectors when consuming messages from a queue, with respect to queue depth n? In particular, is it linear (O(n)) per read? Is it implementation-dependent (on the JMS provider), and does it depend on what fields are being requested?
(If it is implementation-dependent, I'm particularly interested in the behaviour of WebSphere MQ and Solace, but I welcome answers that deal with any particular JMS provider, especially if you have links to documentation describing the complexity!)
Motivation: each message has two properties: an invocationID and a batchName. A batch consists of several invocations. Clients wish to consume messages in one of two ways: either by invocationID or by batchName. At the point that messages are produced, I don't know by which method they will be consumed.
This can be implemented through selectors:
invocationID=42
Or
batchName='reconciliation'
...and I can speed one of these up by using the correlation ID instead of a custom property, but am concerned that the other will remain slow.
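As a minimal sketch of how the two selector strings above might be built safely: JMS selector syntax follows a subset of SQL-92, so string literals are single-quoted and an embedded single quote is doubled. The class SelectorStrings and its method names are hypothetical helpers, not part of the JMS API, and the numeric form assumes invocationID was set as a numeric property (for example with setLongProperty); if it was set as a string, the literal would need quoting too.

```java
// Hypothetical helper: SelectorStrings and its method names are illustrative,
// not part of the JMS API.
final class SelectorStrings {
    private SelectorStrings() {}

    // Numeric comparison: assumes invocationID is stored as a numeric property.
    static String byInvocation(long invocationId) {
        return "invocationID = " + invocationId;
    }

    // String literals are single-quoted; an embedded single quote is doubled.
    static String byBatch(String batchName) {
        return "batchName = '" + batchName.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        System.out.println(byInvocation(42));
        System.out.println(byBatch("reconciliation"));
    }
}
```

The resulting string would typically be passed as the messageSelector argument of Session.createConsumer(Destination, String).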
According to the docs, the messages are searched sequentially. WMQ does, however, index the MessageID and CorrelID fields. The Infocenter describes the behavior as follows:
Selecting messages from a queue requires WebSphere MQ to sequentially inspect each message on the queue. Messages are inspected until a message is found that matches the selection criteria or there are no more messages to examine. Therefore, messaging performance suffers if message selection is used on deep queues.
To optimize message selection on deep queues when selection is based on JMSCorrelationID or JMSMessageID, use a selection string of the form JMSCorrelationID = ... or JMSMessageID = ... and reference only one property.
This method offers a significant improvement in performance for selection on JMSCorrelationID and offers a marginal performance improvement for JMSMessageID.
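To make the difference concrete, here is a toy model of the behaviour the Infocenter describes: an arbitrary selector scans from the head of the queue until it finds a match, while a lookup on an indexed field does not depend on queue depth. ToyQueue, Msg and the inspected counter are hypothetical names for this sketch, and the hash map merely stands in for whatever index a provider actually keeps; none of this reflects WebSphere MQ internals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model only: ToyQueue is hypothetical and does not reflect provider internals.
final class ToyQueue {
    static final class Msg {
        final String correlId;
        final Map<String, Object> props;
        Msg(String correlId, Map<String, Object> props) {
            this.correlId = correlId;
            this.props = props;
        }
    }

    private final List<Msg> messages = new ArrayList<>();
    // Index from correlation ID to the position of the first message carrying it.
    private final Map<String, Integer> firstByCorrelId = new HashMap<>();
    int inspected; // messages examined by the most recent lookup

    void put(Msg m) {
        firstByCorrelId.putIfAbsent(m.correlId, messages.size());
        messages.add(m);
    }

    // Arbitrary selector: scan from the head until a match, O(n) in the worst case.
    Msg findBySelector(Predicate<Msg> selector) {
        inspected = 0;
        for (Msg m : messages) {
            inspected++;
            if (selector.test(m)) return m;
        }
        return null;
    }

    // Indexed field: a single lookup, independent of queue depth.
    Msg findByCorrelId(String correlId) {
        Integer pos = firstByCorrelId.get(correlId);
        inspected = (pos == null) ? 0 : 1;
        return (pos == null) ? null : messages.get(pos);
    }

    public static void main(String[] args) {
        ToyQueue q = new ToyQueue();
        for (int i = 0; i < 10_000; i++) {
            q.put(new Msg("inv-" + i, Map.of("batchName", "batch-" + (i / 100))));
        }
        q.findBySelector(m -> "batch-99".equals(m.props.get("batchName")));
        System.out.println("scan inspected " + q.inspected);
        q.findByCorrelId("inv-9900");
        System.out.println("index inspected " + q.inspected);
    }
}
```

In this model the cost of the scan grows with the position of the first matching message, which is why selection on a non-indexed property gets slower as the queue deepens.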
I would love to understand more about the requirement to multiplex queues. A complex selector is going to impact performance on anyone's implementation, and the alternative of using multiple open handles with simpler selectors is no different to the app code than using multiple queues. For WMQ, of course, dynamic queues or many permanently defined queues are no problem at all. Very often when I see this requirement, it comes from shops that have used certain other transports where performance takes a severe dive with many queues defined, and there is an assumption that this is true on WMQ as well. In other cases the requirement has been met with Pub/Sub and durable subscriptions. I'm not suggesting there are no valid cases for this requirement, just wondering what is driving it.
It all depends on the implementation. A lot of JMS providers store messages in a SQL database, so they can use SQL to implement selectors. In that case you would have to look at how your particular selector is mapped into SQL.
As for WebSphere MQ, the selector implementation is O(log n) for JMSMessageID = sth and JMSCorrelationID = sth; for the others I have no specific knowledge. From experience it looks like O(n) though.
With WebSphere MQ version 7 the implementation of selectors was altered. With a v7 JMS client and a v7 QueueManager, the selection processing is done on the QueueManager side. With a v6 JMS client (or in fact a v7 client working in its migration mode), all messages are flowed across to the client for processing. If the hit rate of matching messages was low, there was a lot of wasted effort. With v7 the processing is done on the QueueManager side, so only messages that match are sent to the client.
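The "wasted effort" can be illustrated with a rough count of how many messages cross to the client under each approach. This is a toy sketch under simplifying assumptions (it ignores browsing, batching and network details), and SelectionSide with its two methods is a hypothetical name, not anything from the WebSphere MQ client.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy model only: SelectionSide is hypothetical and ignores batching and network details.
final class SelectionSide {
    private SelectionSide() {}

    // Client-side selection: every message is sent to the client, which then filters,
    // so the selector does not reduce the transfer count.
    static <T> int transferredClientSide(List<T> queue, Predicate<T> selector) {
        return queue.size();
    }

    // QueueManager-side selection: only matching messages are sent to the client.
    static <T> int transferredServerSide(List<T> queue, Predicate<T> selector) {
        return (int) queue.stream().filter(selector).count();
    }

    public static void main(String[] args) {
        List<Integer> invocationIds = List.of(7, 42, 13, 42, 99);
        Predicate<Integer> selector = id -> id == 42;
        System.out.println(transferredClientSide(invocationIds, selector));
        System.out.println(transferredServerSide(invocationIds, selector));
    }
}
```

The lower the hit rate, the larger the gap between the two counts, although the QueueManager still has to evaluate the selector against each message it inspects.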
Keep in mind that the QueueManager doesn't maintain complex indexes of message properties as a database would. So the simpler the selectors, the better.