Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to take elementwise sum of two Lists

I can do elementwise operation like sum using Zipped function. Let I have two Lists L1 and L2 as shown below

val L1 = List(1,2,3,4)
val L2 = List(5,6,7,8)

I can take element wise sum in following way

(L1,L2).zipped.map(_+_)

and result is

List(6, 8, 10, 12) 

as expected.

I am using Zipped function in my actual code but it takes too much time. In reality My List Size is more than 1000 and I have more than 1000 Lists and my algorithm is iterative where iterations could be up to one billion.

In code I have to do following stuff

list =( (L1,L2).zipped.map(_+_).map (_  * math.random) , L3).zipped.map(_+_)

size of L1,L2 and L3 is same. Moreover I have to execute my actual code on a cluster.

What is the fastest way to take elementwise sum of Lists in Scala?

like image 233
Asif Avatar asked Dec 01 '19 13:12

Asif


People also ask

How do you sum two lists?

The sum() function is used to add two lists using the index number of the list elements grouped by the zip() function. A zip() function is used in the sum() function to group list elements using index-wise lists. Let's consider a program to add the list elements using the zip function and sum function in Python.

How do you sum two list elements in Python?

map() can also be used, as we can input the add operation to the map() along with the two list and map() can perform the addition of both the techniques. This can be extended to any mathematical operation possible. sum() can perform the index-wise addition of the list that can be “zipped” together using the zip() .

How do you sum two arrays in Python?

To add the two arrays together, we will use the numpy. add(arr1,arr2) method. In order to use this method, you have to make sure that the two arrays have the same length. If the lengths of the two arrays are​ not the same, then broadcast the size of the shorter array by adding zero's at extra indexes.

How do you sum every other number in a list Python?

You can use sum(mylist[1::2]) to add every odd items. Note: if you are dealing with huge lists or you are already memory constrained you can use itertools. islice instead of plain slicing: sum(islice(mylist, 0, len(mylist), 2)) .


1 Answers

One option would be to use a Streaming implementation, taking advantage of the lazyness may increase the performance.

An example using LazyList (introduced in Scala 2.13).

def usingLazyList(l1: LazyList[Double], l2: LazyList[Double], l3: LazyList[Double]): LazyList[Double] =
  ((l1 zip l2) zip l3).map {
    case ((a, b), c) =>
      ((a + b) * math.random()) + c
  }

And an example using fs2.Stream (introduced by the fs2 library).

import fs2.Stream
import cats.effect.IO

def usingFs2Stream(s1: Stream[IO, Double], s2: Stream[IO, Double], s3: Stream[IO, Double]): Stream[IO, Double] =
  s1.zipWith(s2) {
    case (a, b) =>
      (a + b) * math.random()
  }.zipWith(s3) {
    case (acc, c) =>
      acc + c
  }

However, if those are still too slow, the best alternative would be to use plain arrays.

Here is an example using ArraySeq (introduced in Scala 2.13 too) which at least will preserve immutability. You may use raw arrays if you prefer but take care.
(if you want, you may also use the collections-parallel module to be even more performant)

import scala.collection.immutable.ArraySeq
import scala.collection.parallel.CollectionConverters._

def usingArraySeq(a1: ArraySeq[Double], a2: ArraySeq[Double], a3: ArraySeq[Double]): ArraySeq[Double] = {
  val length = a1.length

  val arr = Array.ofDim[Double](length)

  (0 until length).par.foreach { i =>
    arr(i) = ((a1(i) + a2(i)) * math.random()) + a3(i)
  }

  ArraySeq.unsafeWrapArray(arr)
}
like image 81
Luis Miguel Mejía Suárez Avatar answered Sep 18 '22 14:09

Luis Miguel Mejía Suárez