Should one use Disruptor (LMAX) with a big model in memory and CQRS?

Tags:

java

concurrency

disruptor-pattern

Our system has a structured model (about 30 different entities with several kinds of relations) kept entirely in memory (about 10 GB) for performance reasons. On this model we have to do 3 kinds of operations:

1. update one or a few entities
2. query for particular data (this usually requires reading thousands of entities)
3. get statistical data (how much memory is used, how many queries of each kind, etc.)

Currently the architecture is a fairly standard one, with a pool of servlet threads that use a shared model. Inside the model there are a lot of concurrent collections, but there are still many waits because some entities are "hotter" and most threads want to read/write them. Note also that queries are usually much more CPU- and time-consuming than writes.

I'm studying the possibility of switching to a Disruptor architecture, keeping the model in a single thread and moving everything possible (validity checks, auditing, etc.) out of the model into separate consumers.

The first question, of course, is: does it make sense?

The second question is: ideally, write requests should take precedence over read ones. What is the best way to implement priority in the Disruptor? I was thinking about 2 ring buffers, reading from the high-priority one more often than from the low-priority one.
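To make the two-buffer idea concrete, here is a minimal sketch using plain JDK queues instead of the Disruptor API. `PriorityDispatcher`, `maxWritesPerRead`, and `runOnce` are hypothetical names invented for this illustration; the cap on writes per round is an assumption added so that reads cannot be starved.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical dispatcher: many producers, one consumer thread that owns the model. */
final class PriorityDispatcher {
    private final Queue<Runnable> writes = new ConcurrentLinkedQueue<>(); // high priority
    private final Queue<Runnable> reads = new ConcurrentLinkedQueue<>();  // low priority
    private final int maxWritesPerRead;

    PriorityDispatcher(int maxWritesPerRead) {
        if (maxWritesPerRead < 1) {
            throw new IllegalArgumentException("maxWritesPerRead must be >= 1");
        }
        this.maxWritesPerRead = maxWritesPerRead;
    }

    void submitWrite(Runnable task) { writes.add(task); }

    void submitRead(Runnable task) { reads.add(task); }

    /**
     * One scheduling round, meant to be called in a loop by the single model thread:
     * run up to maxWritesPerRead pending writes, then at most one read.
     * Returns the number of tasks executed in this round.
     */
    int runOnce() {
        int writesRun = 0;
        Runnable write;
        while (writesRun < maxWritesPerRead && (write = writes.poll()) != null) {
            write.run();
            writesRun++;
        }
        Runnable read = reads.poll();
        if (read == null) {
            return writesRun;
        }
        read.run();
        return writesRun + 1;
    }
}
```

With `maxWritesPerRead = 2`, three queued writes and two queued reads would run as write, write, read, write, read.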

To clarify, the question is more architectural than about the actual code of the LMAX Disruptor.

Update with more details

The data is a complex domain, with many entities (>100k) of many different types (~20) linked to each other in a tree structure with many different collections.

Queries usually involve traversing thousands of entities to find the correct data. Updates are frequent but quite limited, around 10 entities at a time, so overall the data does not change very much (about 20% per hour).

I did some preliminary tests and it appears the speed advantage of querying the model in parallel outweighs the occasional write-lock delays.

431

asked Oct 17 '12 08:10

Uberto

1 Answer

LMAX may be appropriate.

The LMAX people first implemented a traditional architecture, then they implemented actors (with queues) and found that the actors spent most of their time in the queues. Then they went to the single-threaded architecture. Now, the Disruptor is not the key to the architecture; the key is a single-threaded BL (business logic). With 1 writer (a single thread) and small objects you're going to get a high cache hit rate and no contention. To do this they had to move all long-running code (which includes IO) out of the business layer. To do that they used the Disruptor, which is basically just a ring buffer with a single writer, as has been used in device driver code for a while, but at a huge message scale.

First, I have one disagreement with this: LMAX is an actor system, where you have 1 actor for all the BL (and the disruptors connect the other actors). They could have improved their actor system significantly instead of jumping to 1 actor for the BL, namely:
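As a rough illustration of "single-threaded BL", the sketch below confines a model to one thread and funnels every read and write through it, so the model itself needs no locks or concurrent collections. It uses a plain JDK single-thread executor rather than the Disruptor, and `SingleWriterModel`, `update`, and `sum` are made-up names, not anything from LMAX.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical model that is only ever touched by one dedicated thread. */
final class SingleWriterModel implements AutoCloseable {
    // Plain HashMap: safe only because every access runs on modelThread.
    private final Map<String, Long> entities = new HashMap<>();

    // Daemon thread so a forgotten close() does not keep the JVM alive.
    private final ExecutorService modelThread = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "model-thread");
        t.setDaemon(true);
        return t;
    });

    /** Enqueue a write; it is applied in submission order on the model thread. */
    CompletableFuture<Void> update(String id, long value) {
        return CompletableFuture.runAsync(() -> entities.put(id, value), modelThread);
    }

    /** Enqueue a query; it sees every update submitted before it. */
    CompletableFuture<Long> sum() {
        return CompletableFuture.supplyAsync(
                () -> entities.values().stream().mapToLong(Long::longValue).sum(),
                modelThread);
    }

    @Override
    public void close() {
        modelThread.shutdown();
    }
}
```

Anything slow (IO, auditing, validation) would have to run before or after these tasks on other threads, otherwise it stalls every other request behind it.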

1. Don't have lots of services/actors; try to keep commonly used components in one service. (This comes up time and time again in SOA and distributed systems as well.)
2. When communicating between actors, use point-to-point queues, not many-to-1 (like all the service buses!).
3. When you have point-to-point queues, ensure the tail is a pointer to a separate memory area.

With 2 and 3 you can now use lockless queues, and the queues/threads only have 1 writer (and you can even use non-temporal 256-bit YMM writes into the queue). However, the system now has more threads (and, if you have done 1 correctly, a relatively small number of messages between actors). The queues are similar to disruptors, can batch-process many entries, and can use a ring-buffer style.
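A point-to-point lockless queue of this kind could look roughly like the following single-producer/single-consumer ring buffer. This is a sketch under assumptions: `SpscRingBuffer` is a hypothetical class, and keeping `head` and `tail` in two separate `AtomicLong` objects only approximates "a separate memory area", since the JVM gives no guarantee that they land on different cache lines without explicit padding.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bounded queue for exactly one producer thread and one consumer thread. */
final class SpscRingBuffer<T> {
    private final Object[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next index to read; written only by the consumer
    private final AtomicLong tail = new AtomicLong(); // next index to write; written only by the producer

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1;
    }

    /** Producer side: returns false instead of blocking when the buffer is full. */
    boolean offer(T value) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = value;
        tail.lazySet(t + 1); // ordered store: the slot write becomes visible before the new tail
        return true;
    }

    /** Consumer side: returns null when the buffer is empty. */
    @SuppressWarnings("unchecked")
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null;
        }
        int index = (int) (h & mask);
        T value = (T) slots[index];
        slots[index] = null; // release the reference before handing the slot back
        head.lazySet(h + 1);
        return value;
    }
}
```

Because each counter has a single writer, no compare-and-swap or lock is needed; a consumer can also call `poll()` in a loop to batch-process everything currently available.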

With these actors you have a more modular (and hence maintainable) system (and the system could launch more actors to process the queues; note: 1 writer!).

Re your case, I think 20% of changes in an hour is huge... Are the queries always on in-memory objects? Do you have in-memory hash tables/indexes? Can you use read-only collections? Does it matter if your data is old? E.g. eBay uses a 1-hour refresh on its items collection, so the item collection itself is static. With a static collection and static item briefs, they have a static index, and you can search and find items fast, all in memory. Every hour it gets rebuilt, and when the rebuild is complete (it could take minutes) the system switches to the new data. Note that the items themselves are not static.

In your case, with a huge domain, the single thread may get a lowish cache hit rate, which is different from LMAX, who have a smaller domain for each message to pass over.

An agent-based system may be the best bet, namely because a bunch of entities can be grouped and hence have a high cache hit rate. But I need to know more. E.g. moving validity checks, auditing, logging, etc. out is probably a good plan. Less code = smaller objects = higher cache hit rate, and LMAX objects were small.

Hope this quick dump helps, but it's hard from only a glance.
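The rebuild-and-switch approach can be sketched as an immutable snapshot behind an atomic reference. This is only an illustration of the pattern, not how eBay implements it; `SnapshotIndex`, `lookup`, and `rebuild` are hypothetical names, and the index is reduced to a simple key/value map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical read-mostly index: readers never lock, a rebuild swaps in a fresh snapshot. */
final class SnapshotIndex {
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.emptyMap());

    /** Lock-free read against whichever snapshot is current at the time of the call. */
    String lookup(String key) {
        return current.get().get(key);
    }

    /**
     * Build a new immutable snapshot off to the side (this may be slow),
     * then publish it with a single atomic reference write.
     */
    void rebuild(Map<String, String> source) {
        Map<String, String> fresh = Collections.unmodifiableMap(new HashMap<>(source));
        current.set(fresh);
    }
}
```

Readers that started on the old snapshot keep using it until they finish, so a rebuild never blocks a query; the cost is that queries can see data up to one refresh interval old.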

194

answered Nov 12 '22 23:11

user1496062
