I can perform elementwise operations such as a sum using the zipped method. Suppose I have two lists, L1 and L2, as shown below:
val L1 = List(1,2,3,4)
val L2 = List(5,6,7,8)
I can take the elementwise sum in the following way:
(L1,L2).zipped.map(_+_)
and the result is
List(6, 8, 10, 12)
as expected.
I am using zipped in my actual code, but it takes too much time. In reality, each list has more than 1000 elements, I have more than 1000 lists, and my algorithm is iterative, with up to one billion iterations.
In the code I have to do the following:
list = ((L1, L2).zipped.map(_ + _).map(_ * math.random()), L3).zipped.map(_ + _)
L1, L2 and L3 all have the same size. Moreover, I have to run my actual code on a cluster.
What is the fastest way to take the elementwise sum of lists in Scala?
One option would be to use a streaming implementation. Because the elements are computed lazily, the intermediate collections that each zipped.map call builds are avoided, which may improve performance.
An example using LazyList (introduced in Scala 2.13):
def usingLazyList(l1: LazyList[Double], l2: LazyList[Double], l3: LazyList[Double]): LazyList[Double] =
  ((l1 zip l2) zip l3).map {
    case ((a, b), c) =>
      ((a + b) * math.random()) + c
  }
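Staying with the standard library, a related lazy option is lazyZip, which in Scala 2.13 replaces the deprecated zipped and accepts a third collection, so the whole expression can run in one pass without an intermediate list. The following is only a sketch: the object name ElementwiseSum, the method name fusedSum and the injectable factor parameter are hypothetical, with factor added purely so the random multiplier can be fixed when checking results.

```scala
object ElementwiseSum {
  // Hypothetical helper: computes ((a + b) * factor()) + c per index in a single pass.
  // `factor` defaults to math.random(); it is a parameter only to make the result testable.
  def fusedSum(
      l1: List[Double],
      l2: List[Double],
      l3: List[Double],
      factor: () => Double = () => math.random()
  ): List[Double] =
    // lazyZip builds no intermediate list; the elements are combined when map runs.
    l1.lazyZip(l2).lazyZip(l3).map((a, b, c) => ((a + b) * factor()) + c)
}
```

For example, ElementwiseSum.fusedSum(List(1.0, 2.0), List(3.0, 4.0), List(5.0, 6.0), () => 1.0) combines the three lists with the multiplier fixed at 1.0. Whether this is actually faster than the zipped version for lists of around 1000 elements would need to be measured.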
And an example using fs2.Stream (provided by the fs2 library):
import fs2.Stream
import cats.effect.IO
def usingFs2Stream(s1: Stream[IO, Double], s2: Stream[IO, Double], s3: Stream[IO, Double]): Stream[IO, Double] =
  s1.zipWith(s2) {
    case (a, b) =>
      (a + b) * math.random()
  }.zipWith(s3) {
    case (acc, c) =>
      acc + c
  }
However, if those are still too slow, the best alternative would be to use plain arrays.
Here is an example using ArraySeq (also introduced in Scala 2.13), which at least preserves immutability. You may use raw arrays if you prefer, but take care.
(If you want, you may also use the scala-parallel-collections module, which may improve performance further.)
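import scala.collection.immutable.ArraySeq
import scala.collection.parallel.CollectionConverters._ // provides .par (scala-parallel-collections)
def usingArraySeq(a1: ArraySeq[Double], a2: ArraySeq[Double], a3: ArraySeq[Double]): ArraySeq[Double] = {
  val length = a1.length
  // Mutable buffer, local to this method, that receives the results.
  val arr = Array.ofDim[Double](length)
  // Each index is written exactly once, so the parallel writes do not conflict.
  (0 until length).par.foreach { i =>
    arr(i) = ((a1(i) + a2(i)) * math.random()) + a3(i)
  }
  // Wraps without copying; safe only because arr never escapes this method.
  ArraySeq.unsafeWrapArray(arr)
}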
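The raw-array variant mentioned above could look roughly like the following. This is a sketch under assumptions, not a benchmarked implementation: the names RawArraySum, sumInto and nextFactor are hypothetical, and the choice to write into a caller-supplied out buffer is an assumption made so that an iterative algorithm can reuse one buffer instead of allocating on every iteration.

```scala
import java.util.concurrent.ThreadLocalRandom

object RawArraySum {
  // Hypothetical helper: writes ((a1(i) + a2(i)) * nextFactor()) + a3(i) into out(i).
  // `nextFactor` is injectable only so the result can be checked deterministically.
  def sumInto(
      a1: Array[Double],
      a2: Array[Double],
      a3: Array[Double],
      out: Array[Double],
      nextFactor: () => Double = () => ThreadLocalRandom.current().nextDouble()
  ): Unit = {
    require(
      a1.length == a2.length && a2.length == a3.length && a3.length == out.length,
      "all arrays must have the same length"
    )
    // A plain while loop over indices: no tuples and no intermediate collections.
    var i = 0
    while (i < out.length) {
      out(i) = ((a1(i) + a2(i)) * nextFactor()) + a3(i)
      i += 1
    }
  }
}
```

The default generator here is ThreadLocalRandom rather than math.random(), because math.random() delegates to a single generator shared by all threads, and the Java documentation notes that per-thread generators can reduce contention when many threads draw random numbers at a high rate. This is the "take care" part: out is mutable, so it must not be shared between threads or modified while another part of the program still reads it.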