Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scala stateful actor, recursive calling faster than using vars?

Sample code below. I'm a little curious why MyActor is faster than MyActor2. MyActor recursively calls process/react and keeps state in the function parameters whereas MyActor2 keeps state in vars. MyActor even has the extra overhead of tupling the state but still runs faster. I'm wondering if there is a good explanation for this or if maybe I'm doing something "wrong".

I realize the performance difference is not significant but the fact that it is there and consistent makes me curious what's going on here.

Ignoring the first two runs as warmup, I get:

MyActor: 559 511 544 529

vs.

MyActor2: 647 613 654 610

import scala.actors._

object Const {
  val NUM = 100000
  val NM1 = NUM - 1
}

trait Send[MessageType] {
  def send(msg: MessageType)
}

// Test 1 using recursive calls to maintain state

abstract class StatefulTypedActor[MessageType, StateType](val initialState: StateType) extends Actor with Send[MessageType] {
  def process(state: StateType, message: MessageType): StateType

  def act = proc(initialState)

  def send(message: MessageType) = {
    this ! message
  }

  private def proc(state: StateType) {
    react {
      case msg: MessageType => proc(process(state, msg))
    }
  }
}

object MyActor extends StatefulTypedActor[Int, (Int, Long)]((0, 0)) {
  override def process(state: (Int, Long), input: Int) = input match {
    case 0 =>
      (1, System.currentTimeMillis())
    case input: Int =>
      state match {
        case (Const.NM1, start) =>
          println((System.currentTimeMillis() - start))
          (Const.NUM, start)
        case (s, start) =>
          (s + 1, start)
      }
  }
}

// Test 2 using vars to maintain state

object MyActor2 extends Actor with Send[Int] {
  private var state = 0
  private var strt = 0: Long

  def send(message: Int) = {
    this ! message
  }

  def act =
    loop {
      react {
        case 0 =>
          state = 1
          strt = System.currentTimeMillis()
        case input: Int =>
          state match {
            case Const.NM1 =>
              println((System.currentTimeMillis() - strt))
              state += 1
            case s =>
              state += 1
          }
      }
    }
}


// main: Run testing

object TestActors {
  def main(args: Array[String]): Unit = {
    val a = MyActor
    //    val a = MyActor2
    a.start()
    testIt(a)
  }

  def testIt(a: Send[Int]) {
    for (_ <- 0 to 5) {
      for (i <- 0 to Const.NUM) {
        a send i
      }
    }
  }
}

EDIT: Based on Vasil's response, I removed the loop and tried it again. And then MyActor2 based on vars leapfrogged and now might be around 10% or so faster. So... lesson is: if you are confident that you won't end up with a stack overflowing backlog of messages, and you care to squeeze every little performance out... don't use loop and just call the act() method recursively.

Change for MyActor2:

  override def act() =
    react {
      case 0 =>
        state = 1
        strt = System.currentTimeMillis()
        act()
      case input: Int =>
        state match {
          case Const.NM1 =>
            println((System.currentTimeMillis() - strt))
            state += 1
          case s =>
            state += 1
        }
        act()
    }
like image 248
mentics Avatar asked Oct 24 '22 13:10

mentics


1 Answers

Such results are caused with the specifics of your benchmark (a lot of small messages that fill the actor's mailbox quicker than it can handle them).

Generally, the workflow of react is following:

  1. Actor scans the mailbox;
  2. If it finds a message, it schedules the execution;
  3. When the scheduling completes, or, when there're no messages in the mailbox, actor suspends (Actor.suspendException is thrown);

In the first case, when the handler finishes to process the message, execution proceeds straight to react method, and, as long as there're lots of messages in the mailbox, actor immediately schedules the next message to execute, and only after that suspends.

In the second case, loop schedules the execution of react in order to prevent a stack overflow (which might be your case with Actor #1, because tail recursion in process is not optimized), and thus, execution doesn't proceed to react immediately, as in the first case. That's where the millis are lost.


UPDATE (taken from here):

Using loop instead of recursive react effectively doubles the number of tasks that the thread pool has to execute in order to accomplish the same amount of work, which in turn makes it so any overhead in the scheduler is far more pronounced when using loop.

like image 53
Vasil Remeniuk Avatar answered Oct 27 '22 11:10

Vasil Remeniuk