
PySpark reduceByKey on multiple values

Tags:

pyspark

If I have key-value pairs where each value is itself a pair, like:

(K, (v1, v2))
(K, (v3, v4))

How can I sum the values element-wise so that I get (K, (v1 + v3, v2 + v4))?

KillerSnail asked Oct 28 '25 09:10


1 Answer

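reduceByKey accepts any function that combines two values for the same key into one. Let's say A is the RDD of key-value pairs. The function must return a tuple, so the two sums need their own parentheses: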

output = A.reduceByKey(lambda x, y: (x[0] + y[0], x[1] + y[1]))
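
To see what the combining function does without a Spark cluster, here is a minimal sketch in plain Python. The names add_pairs and reduce_by_key_local and the sample data are made up for illustration; reduce_by_key_local is only a local stand-in for RDD.reduceByKey, not Spark's implementation.

```python
from collections import defaultdict
from functools import reduce


def add_pairs(x, y):
    """Element-wise sum of two 2-tuples; note that it returns a tuple."""
    return (x[0] + y[0], x[1] + y[1])


def reduce_by_key_local(pairs, func):
    """Plain-Python stand-in for RDD.reduceByKey (illustration only)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Fold each key's values pairwise with func, as reduceByKey does.
    return {key: reduce(func, values) for key, values in grouped.items()}


# Hypothetical sample data in the (K, (v1, v2)) shape from the question.
data = [("a", (1, 2)), ("a", (3, 4)), ("b", (5, 6))]
result = reduce_by_key_local(data, add_pairs)
print(result)
```

With a real RDD the same function should be usable directly, as in A.reduceByKey(add_pairs). Because Spark combines values in no fixed order, the function has to be associative and commutative, which element-wise addition is. For tuples of any length, lambda x, y: tuple(a + b for a, b in zip(x, y)) is one way to generalize it.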
Lokesh A. R. answered Oct 31 '25 13:10


