Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why inconsistent results using subtraction in reduce?

Given the following:

val rdd = List(1,2,3)

I assumed that rdd.reduce((x,y) => (x - y)) would return -4 (i.e. (1-2)-3=-4), but it returned 2.

Why?

like image 405
HappyCane Avatar asked Dec 24 '22 06:12

HappyCane


2 Answers

From the RDD source code (and docs):

/**
* Reduces the elements of this RDD using the specified commutative and
* associative binary operator.
*/
def reduce(f: (T, T) => T): T

reduce is a monoidal reduction, thus it assumes the function is commutative and associative, meaning that the order of applying it to the elements is not guaranteed.

Obviously, your function (x,y)=>(x-y) isn't commutative nor associative.

In your case, the reduce might have been applied this way:

3 - (2 - 1) = 2

or

1 - (2 - 3) = 2
like image 180
Tzach Zohar Avatar answered Jan 11 '23 15:01

Tzach Zohar


You can easy replace subtraction v1 - v2 - ... - vN with v1 - (v2 + ... + vN), so your code can look like

val v1 = 1
val values = Seq(2, 3)
val sum = sc.paralellize(values).reduce(_ + _)
val result = v1 - sum
like image 35
Vitalii Kotliarenko Avatar answered Jan 11 '23 13:01

Vitalii Kotliarenko