
Finding the max value in Spark RDD

From the following, how can I get the tuple with the highest value?

Array[(String, Int)] = Array((a,30),(b,50),(c,20))

In this example the result I want would be (b,50)

blankface asked Dec 06 '25 17:12



1 Answer

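You could use reduce(), keeping whichever tuple has the larger second element at each step:

// Compare the Int fields pairwise and keep the larger tuple
val max_tuple = rdd.reduce((acc, value) =>
  if (acc._2 < value._2) value else acc)
// max_tuple: (String, Int) = (b,50)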

Data (defines the rdd used above):

val rdd = sc.parallelize(Array(("a",30),("b",50),("c",20)))
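
As an alternative sketch, the same result can also be had from RDD.max() by passing an Ordering that compares the second element. The object name MaxTupleExample, the helper maxBySecond, the app name, and the local[*] master are assumptions for a standalone run, not part of the original answer. In spark-shell, where sc already exists, only the rdd.max() line is needed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object MaxTupleExample {
  // Hypothetical helper: returns the pair with the largest Int.
  // Like reduce(), max() throws on an empty RDD.
  def maxBySecond(rdd: RDD[(String, Int)]): (String, Int) =
    rdd.max()(Ordering.by[(String, Int), Int](_._2))

  def main(args: Array[String]): Unit = {
    // Assumed local setup so the example runs outside spark-shell
    val conf = new SparkConf().setAppName("max-tuple").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(("a", 30), ("b", 50), ("c", 20)))
      println(maxBySecond(rdd)) // (b,50)
    } finally {
      sc.stop()
    }
  }
}
```

Both approaches fail on an empty RDD, so check rdd.isEmpty() first if that case can occur.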
mtoto answered Dec 09 '25 18:12



