Given the following:
val rdd = sc.parallelize(List(1,2,3))
I assumed that rdd.reduce((x,y) => (x - y)) would return -4 (i.e. (1-2)-3=-4), but it returned 2.
Why?
From the RDD source code (and docs):
/**
* Reduces the elements of this RDD using the specified commutative and
* associative binary operator.
*/
def reduce(f: (T, T) => T): T
reduce is a monoidal reduction: it assumes the function is commutative and associative, so the order in which the function is applied to the elements is not guaranteed.
Your function (x,y)=>(x-y) is neither commutative nor associative.
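As a quick sanity check, here is a minimal plain-Scala sketch (no Spark needed) that tests both properties on the values from the question. The object and helper names are made up for illustration.

```scala
object SubtractionProperties {
  // The function from the question
  val sub: (Int, Int) => Int = (x, y) => x - y

  // Commutative would mean f(a, b) == f(b, a)
  def commutesOn(a: Int, b: Int): Boolean = sub(a, b) == sub(b, a)

  // Associative would mean f(f(a, b), c) == f(a, f(b, c))
  def associatesOn(a: Int, b: Int, c: Int): Boolean =
    sub(sub(a, b), c) == sub(a, sub(b, c))

  def main(args: Array[String]): Unit = {
    println(commutesOn(1, 2))      // false: 1 - 2 = -1, but 2 - 1 = 1
    println(associatesOn(1, 2, 3)) // false: (1 - 2) - 3 = -4, but 1 - (2 - 3) = 2
  }
}
```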
In your case, the reduce might have been applied this way:
3 - (2 - 1) = 2
or
1 - (2 - 3) = 2
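To see how a grouping like this can arise, here is a rough plain-Scala simulation, assuming the elements are split into partitions, each partition is reduced on its own, and the partial results are then combined. partitionedReduce is a hypothetical helper, not a Spark API, and the partition layouts are chosen by hand; in a real job the layout and merge order are up to Spark.

```scala
object ReduceOrderDemo {
  // Hypothetical helper: reduce each partition left to right,
  // then combine the per-partition results left to right.
  def partitionedReduce(partitions: Seq[Seq[Int]])(f: (Int, Int) => Int): Int =
    partitions.filter(_.nonEmpty).map(_.reduceLeft(f)).reduceLeft(f)

  def main(args: Array[String]): Unit = {
    val sub: (Int, Int) => Int = _ - _
    val add: (Int, Int) => Int = _ + _

    // Subtraction: the result depends on how the data is split
    println(partitionedReduce(Seq(Seq(1, 2, 3)))(sub))      // -4: (1 - 2) - 3
    println(partitionedReduce(Seq(Seq(1), Seq(2, 3)))(sub)) // 2: 1 - (2 - 3)

    // Addition: the same result for either split
    println(partitionedReduce(Seq(Seq(1, 2, 3)))(add))      // 6
    println(partitionedReduce(Seq(Seq(1), Seq(2, 3)))(add)) // 6
  }
}
```

With a commutative and associative function such as addition, every split gives the same answer, which is why reduce requires those properties.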
You can easily replace the subtraction v1 - v2 - ... - vN with v1 - (v2 + ... + vN). Addition is commutative and associative, so it is safe to use in reduce, and your code can look like:
val v1 = 1
val values = Seq(2, 3)
val sum = sc.parallelize(values).reduce(_ + _)
val result = v1 - sum
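If all the values live in one collection, a possible variant is to sum everything and correct for the first element, since v1 - (total - v1) = 2 * v1 - total. The sketch below uses plain Scala collections to check the arithmetic; subtractAll is a hypothetical helper name. With an RDD, the same idea would presumably use rdd.first() and rdd.reduce(_ + _), assuming the first element is the one you intend as v1.

```scala
object SubtractViaSum {
  // Hypothetical helper: computes v1 - v2 - ... - vN using only addition
  // as the reducing function, so the reduction order does not matter.
  def subtractAll(values: Seq[Int]): Int = {
    require(values.nonEmpty, "need at least one value")
    val first = values.head
    val total = values.reduce(_ + _) // order-insensitive
    2 * first - total                // first - (total - first)
  }

  def main(args: Array[String]): Unit = {
    println(subtractAll(Seq(1, 2, 3))) // -4, same as (1 - 2) - 3
  }
}
```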