
Scala actors vs. threads and blocking IO

As I understand it, actors are basically lightweight threads implemented on top of real threads, with many actors running on a small pool of shared threads.

Given that's the case, using blocking operations in an actor blocks the underlying thread. This is not a correctness problem because the actor library will spawn more threads as necessary (is that right?) but then you end up with lots and lots of threads, negating the benefit of using actors in the first place.

Given that, how do actors work when you need to do such IO operations? Are there operations which "actor-block", suspending the actor while letting the thread go on to other operations (much as blocking operations suspend the thread while letting the CPU go on to other operations), or is everything written in CPS, with chained actors? Or are actors simply not a good fit for this sort of long-running operation?

Background: I have experience writing multithreaded code the classic way, and understand pretty well how CPS/event loops work, but have absolutely no experience working with actors, and just want to understand, on a high level, how they fit in before I dive into the code.

Li Haoyi asked Jan 24 '12 04:01


1 Answer

"This is not a correctness problem because the actor library will spawn more threads as necessary (is that right?)"

As far as I understand, that is not right. The actor is blocked, and sending another message to it causes that message to sit in the actor's mailbox until the actor can receive it or react to it.

Programming in Scala (1) explicitly states that actors should not block. If an actor needs to do something long-running, it should pass the work to a second actor, so that the main actor can free itself up and go read more messages from its mailbox. Once the worker has completed the work, it can signal that fact back to the main actor, which can finish doing whatever it has to do.

Since workers too have mailboxes, you will end up with several workers busily working their way through the work. If you don't have enough CPU to handle that, their queues will just get bigger and bigger. Eventually you can scale out by using remote actors. Akka might be more useful in such cases.
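The hand-off described above can be sketched without any actor library at all. What follows is only an illustration under stated assumptions, not the scala.actors API: a LinkedBlockingQueue stands in for the main actor's mailbox, a small fixed thread pool stands in for the worker actors, and Thread.sleep stands in for the blocking IO call. The names HandOffSketch, Job, Done, Ping and demo are made up for this example.

```scala
import java.util.concurrent.{Executors, LinkedBlockingQueue}
import scala.collection.mutable.ListBuffer

object HandOffSketch {
  // Hypothetical message protocol for this sketch.
  sealed trait Msg
  final case class Job(id: Int) extends Msg  // long-running request
  final case class Done(id: Int) extends Msg // completion signal from a worker
  case object Ping extends Msg               // cheap request

  /** Runs the scenario; returns the order in which the main loop handled messages. */
  def demo(): List[String] = {
    val mailbox = new LinkedBlockingQueue[Msg]()  // stand-in for the main actor's mailbox
    val workers = Executors.newFixedThreadPool(2) // stand-in for worker actors
    val handled = ListBuffer.empty[String]

    mailbox.put(Job(1))
    mailbox.put(Ping)

    var outstanding = 0 // jobs handed off whose Done has not arrived yet
    while (outstanding > 0 || !mailbox.isEmpty) {
      mailbox.take() match {
        case Job(id) =>
          // Hand the slow part to a worker instead of doing it here.
          outstanding += 1
          workers.execute(new Runnable {
            override def run(): Unit = {
              Thread.sleep(300)     // assumption: stand-in for blocking IO
              mailbox.put(Done(id)) // signal completion back to the main loop
            }
          })
          handled += s"handed off job $id"
        case Ping =>
          handled += "ping"
        case Done(id) =>
          outstanding -= 1
          handled += s"job $id done"
      }
    }
    workers.shutdown()
    handled.toList
  }

  def main(args: Array[String]): Unit = demo().foreach(println)
}
```

Because the main loop only ever waits on its own mailbox, the ping should be handled while job 1 is still sleeping in the worker, and the completion message arrives afterwards.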

(1) Chapter 32.5 of Programming in Scala (Odersky, Second edition, 2010)

EDIT: I found this:

The scheduler method of the Actor trait can be overridden to return a ResizableThreadPoolScheduler, which resizes its thread pool to avoid starvation caused by actors that invoke arbitrary blocking methods.

Found it at: http://www.scala-lang.org/api/current/scala/actors/Actor.html

So, depending on the scheduler implementation you set, the thread pool used to run the actors may grow. I was wrong when I said you were wrong :-) The rest of the answer still holds true.
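To see why a pool that can grow avoids the starvation mentioned in the quoted documentation, here is a rough sketch using plain java.util.concurrent executors. These are stand-ins chosen for illustration: Executors.newCachedThreadPool is not ResizableThreadPoolScheduler, and the names StarvationSketch and blockedTaskGotSignal are hypothetical. One task blocks its thread until a second task signals it; the second task can only run if the pool has a free thread or is allowed to add one.

```scala
import java.util.concurrent.{Callable, CountDownLatch, ExecutorService, Executors, TimeUnit}

object StarvationSketch {
  /** Returns true if the blocked task received its signal before timing out. */
  def blockedTaskGotSignal(pool: ExecutorService): Boolean = {
    val signal = new CountDownLatch(1)
    // Task A blocks its pool thread until signalled (stand-in for a blocking call).
    val blocked = pool.submit(new Callable[Boolean] {
      override def call(): Boolean = signal.await(500, TimeUnit.MILLISECONDS)
    })
    // Task B would unblock A, but it needs a thread of its own to run on.
    pool.execute(new Runnable {
      override def run(): Unit = signal.countDown()
    })
    try blocked.get() finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    println("fixed pool of 1: " + blockedTaskGotSignal(Executors.newFixedThreadPool(1)))
    println("growing pool:    " + blockedTaskGotSignal(Executors.newCachedThreadPool()))
  }
}
```

With a single fixed thread, task B sits in the queue until task A gives up, so A is starved. With a pool that adds threads on demand, B runs alongside A and releases it. The cost is the one raised in the question: every blocked task ties up a real thread.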

Ant Kutschera answered Sep 22 '22 15:09