 

Basics | Threaded vs Reactive concurrency model

I'm a complete newbie to the world of reactive programming. I'm looking into Akka Actors as a beginning step to play with.

My understanding of the thread-based concurrency model (e.g. a vanilla Servlet-based model) is:

  1. For every request from the client, a new thread is spawned.
  2. This means the execution of all the underlying code is attached to this thread and occurs serially. Apparently, even if one section of the code hits a bottleneck (a remote web service call, etc.), the thread is blocked and sits idle waiting (see the sketch after this list).
  3. The server (container) has a fixed thread pool that caps the number of concurrent threads, and apparently it can quickly run out of threads because of such bottlenecks.
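A rough sketch of what I mean, assuming a hypothetical `PriceServlet` and a made-up remote URL, just to show where the request thread blocks:

```scala
import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}
import scala.io.Source

// Hypothetical servlet: the container hands each incoming request to one pool thread.
class PriceServlet extends HttpServlet {
  override def doGet(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
    // The request thread blocks right here until the remote service answers;
    // while it waits it does no useful work, yet it still occupies a pool slot.
    val body = Source.fromURL("http://remote-service.example/price").mkString
    resp.getWriter.write(body)
  }
}
```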

My understanding of the reactive concurrency model (e.g. an Akka-based model) is:

  1. All the logic is no longer attached to a single thread and executed serially.
  2. The execution flow is reactive (i.e. an actor is triggered when a message arrives); see the sketch after this list.
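A minimal classic-Akka sketch of that message-driven flow (the actor and message names are invented for illustration):

```scala
import akka.actor.{Actor, ActorSystem, Props}

// Hypothetical message: work arrives as an immutable message, not as a blocked thread.
final case class FetchPrice(productId: String)

// The actor reacts only when a message lands in its mailbox;
// between messages it holds no thread at all.
class PricingActor extends Actor {
  def receive: Receive = {
    case FetchPrice(id) =>
      // React to the message; ideally any remote call made here is asynchronous.
      println(s"looking up price for $id")
  }
}

object Demo extends App {
  val system  = ActorSystem("demo")
  val pricing = system.actorOf(Props[PricingActor](), "pricing")
  pricing ! FetchPrice("abc-123")
}
```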

Now my question:

Assume the remote web service bottleneck exists in both models. How does the actor model help achieve better CPU/core utilization? Won't we have the same problem of the execution threads within the actors getting blocked? For example, if 200 actors are concurrently blocked on this web service call, doesn't that mean 200 threads are currently blocked? I understand that there will still be other actors reacting to other upstream events. Is this what we refer to as better utilization of the CPU?

In a threaded model, is the small size of the thread pool the only cause of the problem?

Doesn't the actor subsystem itself use a thread pool to run actors when they react to a specific event? If so, don't we have the same problem?

Pardon me if this question is completely stupid.

asked Mar 15 '15 by user1189332


2 Answers

"Assume the bottleneck..." -- in the Reactive model you would never employ such a service, rather you would asynchronously invoke a request of a webservice, with a handler configured for the eventual response/error. If you invoke synchronous services, you are just re-introducing all of the batch-mode problems, thus compounding yours. If you could not directly employ one, you would create a proxy-like service as a pale imitation (*).

When the response arrives, the handler can unwrap whatever context is required to continue the operation. If that context amounts to a "stack + registers", then your framework is not very good, but it is still loads better than having hundreds of kernel thread contexts lying around just to parse a message.

The formalism of having to construct a response context should guide the solution away from instituting a fight for resources, as is common in the threaded model. In that sense it is a draw: both good threaded and poor reactive solutions are possible, counter-intuitive as that may seem.

Recap: a small data structure in application space which holds the information necessary to continue an operation is a lot better than a kernel thread + user thread + stack. Not just because it uses fewer resources, but because it better expresses your solution, and permits the underlying framework/OS/kernel/... to sequence event dispatch based on more information than a "return address".
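To picture that "small data structure", one illustrative (not prescriptive) sketch is an actor keeping a correlation map of pending operations; all names here are made up:

```scala
import akka.actor.{Actor, ActorRef}

// Hypothetical messages for illustration.
final case class StartOrder(orderId: String, replyTo: ActorRef)
final case class ServiceReply(orderId: String, payload: String)
final case class OrderDone(orderId: String, payload: String)

// Everything needed to resume the operation fits in one small case class,
// instead of a parked kernel thread with its full stack.
final case class Pending(orderId: String, replyTo: ActorRef)

class OrderActor extends Actor {
  private var pending = Map.empty[String, Pending]

  def receive: Receive = {
    case StartOrder(id, replyTo) =>
      pending += id -> Pending(id, replyTo)
      // ...fire the asynchronous web-service request here, tagged with `id`...

    case ServiceReply(id, payload) =>
      pending.get(id).foreach { p =>
        p.replyTo ! OrderDone(id, payload) // resume using the stored context
        pending -= id
      }
  }
}
```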

Naturally reactive problems should have naturally reactive solutions, just as naturally batch-oriented problems should have naturally batch-oriented solutions.

(*) Virtually all reactive frameworks are built upon traditional synchronous kernels, such as Linux; reactive kernels are evolving, and the arrival of large multiprocessors will help bring these concepts into the mainstream.

answered Sep 19 '22 by mevets


Whether or not actors can help comes down to how the actors are implemented. If they are indeed separate threads, then actors not blocked on the web service will, as you say, carry on executing. Thus useful work continues to be done even if the web service is holding up some of them.

The same is true if you increase the size of the thread pool: some of the threads will be doing other useful work.

Having 200 threads means you're now putting more of a burden on the underlying OS, and a lot of people would react in horror. However, it's worth examining what that "burden on the OS" actually is: a little bit of memory, and that's about it. The threads blocked on the web service are not being scheduled, so they're not adding to the system's context-switching burden (context switching is the performance killer; switching between a lot of runnable threads saps a lot of time). So they do comparatively little harm to system performance (provided you are not dynamically spawning actors).

So in both approaches you'd want enough threads (either actors or thread-pool threads) that a reasonable number keep the web service 100% utilised, and enough are doing other tasks to keep the local machine busy too. You'd want just enough to saturate both machines, but not so many that the local machine is juggling too many ready threads at once.
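In Akka terms, that tuning often means giving the unavoidable blocking calls their own bounded pool so the main dispatcher's threads stay free; here's a sketch using only the standard library, with the pool size just a guess you would tune:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object BlockingPool {
  // 16 is an arbitrary starting point; tune it so the remote service stays
  // busy without flooding the local machine with runnable threads.
  private val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(16))

  // Run the unavoidable blocking call on the dedicated pool so the rest of
  // the application keeps its threads for non-blocking work.
  def callLegacyService(block: => String): Future[String] =
    Future(block)(blockingEc)
}
```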

Now, every network, web service and host is different, and performance is a constantly changing thing, so you'll end up writing code to dynamically control the size of the thread pool or the number of actors you're prepared to start. This can be fiddly...

answered Sep 18 '22 by bazza