For my current development, I have many threads (Producers) that create Tasks and many threads that consume these Tasks (Consumers).
Each Producer is identified by a unique name. A Task is made of:
- the name of its Producer
- data
My question concerns the data structure used by the Producers and the Consumers.
Naively, we could imagine that the Producers populate a concurrent queue with Tasks and the Consumers read and consume the Tasks stored in that concurrent queue.
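The naive design described above can be sketched as follows. This is only a minimal illustration, assuming a LinkedBlockingQueue as the concurrent queue; the names NaiveQueueDemo, Task, POISON and run are hypothetical and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class NaiveQueueDemo {
    // Hypothetical Task: the name of its Producer plus some data.
    static final class Task {
        final String name;
        final String data;
        Task(String name, String data) { this.name = name; this.data = data; }
    }

    // Sentinel that tells a consumer thread to stop (an assumption of this sketch).
    static final Task POISON = new Task("", "");

    // Runs producer and consumer threads over one shared queue
    // and returns the number of Tasks that were consumed.
    static int run(int producers, int consumers, int tasksPerProducer) {
        BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
        AtomicInteger consumed = new AtomicInteger();
        List<Thread> producerThreads = new ArrayList<>();
        List<Thread> consumerThreads = new ArrayList<>();

        for (int p = 0; p < producers; p++) {
            String name = "producer-" + p;
            producerThreads.add(new Thread(() -> {
                for (int i = 0; i < tasksPerProducer; i++) {
                    queue.add(new Task(name, "D" + i)); // unbounded queue, add() does not block
                }
            }));
        }
        for (int c = 0; c < consumers; c++) {
            consumerThreads.add(new Thread(() -> {
                try {
                    // A real consumer would process the Task here; this sketch only counts it.
                    while (queue.take() != POISON) {
                        consumed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        try {
            for (Thread t : producerThreads) t.start();
            for (Thread t : consumerThreads) t.start();
            for (Thread t : producerThreads) t.join();
            for (int c = 0; c < consumers; c++) queue.add(POISON); // one stop signal per consumer
            for (Thread t : consumerThreads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(3, 2, 100));
    }
}
```

The queue hands Tasks out in FIFO order, but once two consumer threads each hold a Task, nothing in this sketch controls which of them finishes first.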
I think this solution would scale rather well, but one case is problematic: if a Producer very quickly creates two Tasks having the same name but not the same data (both Tasks T1 and T2 have the same name, but T1 has data D1 and T2 has data D2), it is theoretically possible that they are consumed in the order T2 then T1!
Now, I imagine creating my own data structure (let's say MyQueue) based on Map + Queue. Like a queue, it would have a pop() and a push() method.
The pop() method would be quite simple.
The push() method would:
- check whether the Task is already inserted in MyQueue (doing find() in the Map)
- if found: merge the data of the Task to be inserted with the data stored in the found Task
- if not found: insert the Task in the Map and add an entry to the Queue
Of course, I'll have to make it safe for concurrent access... and that will certainly be my problem; I am almost sure that this solution won't scale.
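The push() and pop() behaviour described above could look roughly like the sketch below. It is a hedged illustration, not a recommended implementation: the generic data type D, the merger function and the single coarse lock (synchronized) are assumptions made to keep the example short, and data is assumed to be non-null.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.BinaryOperator;

// Hypothetical sketch of MyQueue: at most one pending entry per Task name.
class MyQueue<D> {
    private final Map<String, D> pendingData = new HashMap<>(); // name -> merged data
    private final Queue<String> order = new ArrayDeque<>();      // names in arrival order
    private final BinaryOperator<D> merger;                      // how old and new data are combined (assumption)

    MyQueue(BinaryOperator<D> merger) {
        this.merger = merger;
    }

    // push(): merge into the pending Task with this name, or enqueue a new one.
    synchronized void push(String name, D data) {
        D existing = pendingData.get(name);
        if (existing != null) {
            pendingData.put(name, merger.apply(existing, data));
        } else {
            pendingData.put(name, data);
            order.add(name);
        }
    }

    // pop(): remove the oldest pending name and return it with its merged data,
    // or return null when nothing is pending.
    synchronized Map.Entry<String, D> pop() {
        String name = order.poll();
        if (name == null) {
            return null;
        }
        return new AbstractMap.SimpleImmutableEntry<>(name, pendingData.remove(name));
    }
}
```

For example, after push("p1", "D1"), push("p2", "X") and push("p1", "D2") with a merger that concatenates strings with "+", the first pop() returns p1 with "D1+D2" and the second returns p2 with "X". Every call takes the same lock, which is the scaling concern raised above. Note also that merging only applies to Tasks that are still pending: a Task pushed after its predecessor was popped becomes a new entry.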
So my question is now: what is the best data structure to use in order to fulfill my requirements?
You could try Heinz Kabutz's Striped Executor Service as a possible candidate.
This magical thread pool would ensure that all Runnables with the same stripeClass would be executed in the order they were submitted, but StripedRunners with different stripedClasses could still execute independently.
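To illustrate the ordering idea only, here is a minimal sketch that is NOT Heinz Kabutz's StripedExecutorService and does not use its API. It assumes a simpler technique: each stripe (for instance the Producer name) is hashed onto one of a fixed number of single-threaded executors. The names StripedSketch, submit and shutdownAndWait are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: same ordering guarantee per stripe, much simpler than a real striped pool.
class StripedSketch {
    private final ExecutorService[] lanes;

    StripedSketch(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Tasks with the same stripe always land on the same single-threaded lane,
    // so they run in the order they were submitted; other lanes run in parallel.
    void submit(Object stripe, Runnable task) {
        lanes[Math.floorMod(stripe.hashCode(), lanes.length)].execute(task);
    }

    // Stops accepting work and waits for queued tasks to finish.
    // Returns false on timeout or if the waiting thread is interrupted.
    boolean shutdownAndWait(long timeoutSeconds) {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                if (!lane.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With the Producer name as the stripe, T1 and T2 from the same Producer would run in submission order, provided that Producer submits them from a single thread. Unlike a true striped pool, two different stripes that hash to the same lane also wait for each other here.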