Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the fastest way to subtract two arrays in scala

I have two arrays (that i have pulled out of a matrix (Array[Array[Int]]) and I need to subtract one from the other.

At the moment I am using this method however, when I profile it, it is the bottleneck.

def subRows(a: Array[Int], b: Array[Int], sizeHint: Int): Array[Int] = {
   val l: Array[Int] = new Array(sizeHint)
   var i = 0
   while (i < sizeHint) {
     l(i) = a(i) - b(i)
     i += 1
   }
   l
 }

I need to do this billions of times so any improvement in speed is a plus.

I have tried using a List instead of an Array to collect the differences and it is MUCH faster but I lose all benefit when I convert it back to an Array.

I did modify the downstream code to take a List to see if that would help but I need to access the contents of the list out of order so again there is loss of any gains there.

It seems like any conversion of one type to another is expensive and I am wondering if there is some way to use a map etc. that might be faster.

Is there a better way?


EDIT

Not sure what I did the first time!?

So the code I used to test it was this:

def subRowsArray(a: Array[Int], b: Array[Int], sizeHint: Int): Array[Int] = {
  val l: Array[Int] = new Array(sizeHint)
  var i = 0
  while (i < sizeHint) {
    l(i) = a(i) - b(i)
    i += 1
  }
  l
}

def subRowsList(a: Array[Int], b: Array[Int], sizeHint: Int): List[Int] = {
  var l: List[Int] = Nil
  var i = 0
  while (i < sizeHint) {
    l = a(i) - b(i) :: l
    i += 1
  }
  l
}

val a = Array.fill(100, 100)(scala.util.Random.nextInt(2))
val loops = 30000 * 10000

def runArray = for (i <- 1 to loops) subRowsArray(a(scala.util.Random.nextInt(100)), a(scala.util.Random.nextInt(100)), 100)

def runList = for (i <- 1 to loops) subRowsList(a(scala.util.Random.nextInt(100)), a(scala.util.Random.nextInt(100)), 100)

def optTimer(f: => Unit) = {
  val s = System.currentTimeMillis
  f
  System.currentTimeMillis - s
}

The results I thought I got the first time I did this were the exact opposite... I must have misread or mixed up the methods.

My apologies for asking a bad question.

like image 684
Mike Lavender Avatar asked Dec 18 '12 20:12

Mike Lavender


2 Answers

That code is the fastest you can manage single-threaded using a standard JVM. If you think List is faster, you're either fooling yourself or not actually telling us what you're doing. Putting an Int into List requires two object creations: one to create the list element, and one to box the integer. Object creations take about 10x longer than an array access. So it's really not a winning proposition to do it any other way.

If you really, really need to go faster, and must stay with a single thread, you should probably switch to C++ or the like and explicitly use SSE instructions. See this question, for example.

If you really, really need to go faster and can use multiple threads, then the easiest is to package up a chunk of work like this (i.e. a sensible number of pairs of vectors that need to be subtracted--probably at least a few million elements per chunk) into a list as long as the number of processors on your machine, and then call list.par.map(yourSubtractionRoutineThatActsOnTheChunkOfWork).

Finally, if you can be destructive,

a(i) -= b(i)

in the inner loop is, of course, faster. Likewise, if you can reuse space (e.g. with System.arraycopy), you're better off than if you have to keep allocating it. But that changes the interface from what you've shown.

like image 50
Rex Kerr Avatar answered Nov 15 '22 09:11

Rex Kerr


You can use Scalameter to try a benchmark the two implementations which requires at least JRE 7 update 4 and Scala 2.10 to be run. I used scala 2.10 RC2.

Compile with scalac -cp scalameter_2.10-0.2.jar RangeBenchmark.scala.

Run with scala -cp scalameter_2.10-0.2.jar:. RangeBenchmark.

Here's the code I used:

import org.scalameter.api._

object RangeBenchmark extends PerformanceTest.Microbenchmark {
  val limit = 100
  val a = new Array[Int](limit)
  val b = new Array[Int](limit)
  val array: Array[Int] = new Array(limit)
  var list: List[Int] = Nil
  val ranges = for {
    size <- Gen.single("size")(limit)
  } yield 0 until size

  measure method "subRowsArray" in {
    using(ranges) curve("Range") in {
      var i = 0
      while (i < limit) {
        array(i) = a(i) - b(i)
        i += 1
      }
      r => array
    }
  }

  measure method "subRowsList" in {
    using(ranges) curve("Range") in {
      var i = 0
      while (i < limit) {
        list = a(i) - b(i) :: list
        i += 1
      }
      r => list
    }
  }
}

Here's the results:

::Benchmark subRowsArray::
Parameters(size -> 100): 8.26E-4

::Benchmark subRowsList::
Parameters(size -> 100): 7.94E-4

You can draw your own conclusions. :)

The stack blew up on larger values of limit. I'll guess it's because it's measuring the performance many times.

like image 35
Brian Avatar answered Nov 15 '22 07:11

Brian