
Is there a way to ensure that threads are assigned to a specified set of objects?

We are working on an application where a set of objects can be affected by receiving messages from 3 different sources. Each message (from any of the sources) has a single object as its target. Each message receiver will be running on its own thread.

We want the processing of the messages (after receiving), to be as high-speed as possible, so the message processing against the target objects will be done with another thread from a thread pool. The processing of the message will take longer than the reading/receiving of the messages from the senders.

I am thinking that it will be faster if each thread from the pool is dedicated only to a particular set of objects, for example:

Thread1 -> objects named A-L
Thread2 -> objects named M-Z

with each set of objects (or thread) having a dedicated queue of messages waiting to be processed.

My assumption is that this will be faster than randomly assigning worker threads to process the messages (in which case 2 different threads might hold messages for the same object), because the only thread synchronization needed is between each receiving thread and one processing thread, and only for as long as it takes to put the message on a blocking queue.

My question is really 2 parts:

  1. Do people agree with the assumption that dedicating worker threads to a particular set of objects is a better/faster approach?

  2. Assuming this is a better approach, do the existing Java ThreadPool classes have a way to support this? Or does it require us coding our own ThreadPool implementation?

Thanks for any advice that you can offer.

Sam Goldberg asked Nov 12 '12 17:11


2 Answers

[Is] dedicating worker threads to a particular set of objects a better/faster approach?

I assume the overall goal is to maximize the concurrent processing of these inbound messages. You have receivers for the 3 sources that need to put the messages in a pool where they will be handled optimally. Because messages from any of the 3 sources may deal with the same target object, which cannot be processed simultaneously, you want some way to divide up your messages so they can be processed concurrently, but only if they are guaranteed not to refer to the same target object.

I would implement the hashCode() method on your target object (maybe just name.hashCode()) and then use that value to put the messages into an array of BlockingQueues, each with a single thread consuming from it. Using an array of Executors.newSingleThreadExecutor() would be fine. Take the hash value modulo the number of queues and put the message in that queue. You will need to fix the number of processing threads up front; the right maximum depends on how CPU-intensive the processing is.

So something like the following code should work:

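 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
 ...
 private static final int NUM_PROCESSING_QUEUES = 6;
 ...
 // one single-threaded executor per queue, so each queue has exactly one consumer
 ExecutorService[] pools = new ExecutorService[NUM_PROCESSING_QUEUES];
 for (int i = 0; i < pools.length; i++) {
    pools[i] = Executors.newSingleThreadExecutor();
 }
 ...
 // receiver loop:
 while (true) {
    Message message = receiveMessage();
    // assumes message.hashCode() is derived from the target object (e.g. its name);
    // mask the sign bit instead of Math.abs(), which stays negative for Integer.MIN_VALUE
    int hash = message.hashCode() & Integer.MAX_VALUE;
    // put each message in the appropriate pool based on its hash
    // this assumes message is runnable
    pools[hash % pools.length].submit(message);
 }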

One of the benefits of this mechanism is that you may be able to limit the synchronization around the target objects, because you know that a given target object will only ever be updated by a single thread.
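
If it helps, the routing above can be packaged into a small helper. This is only a sketch under a few assumptions: the class name KeyedExecutor and its methods (laneFor, submit, shutdownAndWait) are made up for illustration and are not part of java.util.concurrent, the key is whatever identifies the target object (for example its name), and lambdas require Java 8 or later.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: routes tasks to single-threaded lanes by key. */
final class KeyedExecutor {
    private final ExecutorService[] lanes;

    KeyedExecutor(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < lanes.length; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Index of the lane that owns the key; the mask keeps the hash non-negative. */
    int laneFor(Object key) {
        return (key.hashCode() & Integer.MAX_VALUE) % lanes.length;
    }

    /** Tasks submitted with equal keys land on the same lane and run one at a time. */
    void submit(Object key, Runnable task) {
        lanes[laneFor(key)].execute(task);
    }

    /** Stops accepting work, then waits for queued tasks; true if every lane finished in time. */
    boolean shutdownAndWait(long timeoutMillis) {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            boolean done = true;
            for (ExecutorService lane : lanes) {
                done &= lane.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        KeyedExecutor executor = new KeyedExecutor(4);
        int[] updates = new int[2]; // slot 0 for target "A", slot 1 for target "B"
        for (int i = 0; i < 1000; i++) {
            executor.submit("A", () -> updates[0]++); // only the lane owning "A" touches slot 0
            executor.submit("B", () -> updates[1]++);
        }
        boolean finished = executor.shutdownAndWait(5000);
        System.out.println(finished + " " + updates[0] + " " + updates[1]);
    }
}
```

Each receiver thread would call submit(targetName, task), so the only contention is the hand-off onto one lane's internal queue. The lane count is still a tuning knob you have to pick by testing.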

Do people agree with the assumption that dedicating worker threads to a particular set of objects is a better/faster approach?

Yes. That seems the right way to get optimal concurrency.

Assuming this is a better approach, do the existing Java ThreadPool classes have a way to support this? Or does it require us coding our own ThreadPool implementation?

I don't know of any thread pool which accomplishes this. I would not write your own implementation, however. Just use the existing executors the way the code above outlines.

Gray answered Nov 13 '22 13:11


In general, approaches like this are a bad idea. They fall under the "don't optimize early" mantra.

Further, if implemented, your idea may harm your performance rather than help it. One simple example of where it wouldn't work well is if you suddenly got a lot of requests for one set of objects: the other worker thread would sit idle.
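
To make the skew concrete, here is a small sketch that counts how many messages each worker would receive under the A-L / M-Z split from the question. The class and method names (LaneLoad, laneFor, load) are made up for illustration, and the split rule is my reading of the question, not a measured workload.

```java
/** Hypothetical helper that tallies messages per worker for a fixed two-way name split. */
final class LaneLoad {
    /** Worker 0 owns names starting with A-L; worker 1 owns everything else. */
    static int laneFor(String name) {
        char first = Character.toUpperCase(name.charAt(0));
        return (first >= 'A' && first <= 'L') ? 0 : 1;
    }

    /** Returns the number of messages routed to each of the two workers. */
    static int[] load(String[] targetNames) {
        int[] perLane = new int[2];
        for (String name : targetNames) {
            perLane[laneFor(name)]++;
        }
        return perLane;
    }

    public static void main(String[] args) {
        // A burst that mostly targets objects in the A-L range.
        String[] burst = {"Alpha", "Beta", "Alpha", "Gamma", "Alpha", "Zulu"};
        int[] perLane = load(burst);
        System.out.println("worker 0: " + perLane[0] + ", worker 1: " + perLane[1]);
    }
}
```

With a burst like this, worker 0 gets five of the six messages while worker 1 is nearly idle, and a shared pool would have spread them across both threads.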

The best approach is to use a standard producer-consumer pattern and tune the number of consumer threads by system testing under various loads - ideally by feeding in a recording of real-life transactions.

The "go to" framework for this situation is the set of classes in the java.util.concurrent package. I recommend using a BlockingQueue (probably an ArrayBlockingQueue) with an ExecutorService created from one of the Executors factory methods, probably newCachedThreadPool().
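
A minimal sketch of that arrangement might look like the following. Several things here are my assumptions rather than requirements: the class name SharedQueueDemo, the queue capacity of 1024, using plain strings as stand-ins for messages, and the poison-pill sentinel for shutting consumers down. Because any consumer may pick up a message for any target, the per-target state must be thread-safe, which is why the sketch uses ConcurrentHashMap.merge.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: one shared bounded queue drained by several consumers. */
final class SharedQueueDemo {
    // Sentinel telling a consumer to stop; compared by identity, never by equals().
    private static final String POISON = new String("POISON");

    /** Feeds the target names through the queue and returns how often each was processed. */
    static Map<String, Integer> run(String[] targets, int consumers) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newCachedThreadPool();

        for (int i = 0; i < consumers; i++) {
            pool.execute(() -> {
                try {
                    // take() blocks until a message (or the sentinel) is available
                    for (String t = queue.take(); t != POISON; t = queue.take()) {
                        counts.merge(t, 1, Integer::sum); // atomic update per target
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (String t : targets) {
                queue.put(t); // producer side; blocks if the queue is full
            }
            for (int i = 0; i < consumers; i++) {
                queue.put(POISON); // one sentinel per consumer
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(new String[] {"A", "B", "A", "C", "A"}, 3));
    }
}
```

The consumers argument is the number you would tune under load testing.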


Once you have implemented and system tested that, if you find proven performance problems, then analyse your system, find the bottleneck and fix it.

The reason you shouldn't optimize early is that most of the time the problems are not where you expect them to be.

Bohemian answered Nov 13 '22 12:11