I have an RDD of type:
dataset: org.apache.spark.rdd.RDD[(String, Double)] = MapPartitionsRDD[26]
It contains pairs such as (Pedro, 0.0833), (Hello, 0.001828), ...
I'd like to sum all the values (0.0833 + 0.001828 + ...), but I can't find a proper solution.
Given your input data, you can do any of the following:
// example data with the same shape as the RDD[(String, Double)] in the question
val datasets = sc.parallelize(List(("Pedro", 0.0833), ("Hello", 0.001828)))
datasets.map(_._2).sum()
// res3: Double = 0.085128
// or
datasets.map(_._2).reduce(_ + _)
// res4: Double = 0.085128
// or even
datasets.values.sum()
// res5: Double = 0.085128
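One difference between the three variants is worth knowing: on an empty RDD, sum() returns 0.0, while reduce throws an UnsupportedOperationException. The sketch below is a standalone version of the snippet above, assuming a local SparkSession; the object name SumValues, the app name, and the local[*] master are illustrative choices, not part of the original answer. It also shows fold with a zero value of 0.0 as an empty-safe alternative to reduce.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone wrapper; in spark-shell, `sc` already exists.
object SumValues {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master is acceptable for this illustration.
    val spark = SparkSession.builder()
      .appName("sum-values")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val datasets = sc.parallelize(List(("Pedro", 0.0833), ("Hello", 0.001828)))
    val empty = sc.emptyRDD[(String, Double)]

    // fold takes an explicit zero value, so it is defined for empty RDDs too.
    println(datasets.values.fold(0.0)(_ + _))

    // sum() on an empty RDD returns 0.0.
    println(empty.values.sum())

    // fold on an empty RDD returns its zero value, 0.0.
    println(empty.values.fold(0.0)(_ + _))

    // reduce has no zero value; on an empty RDD it throws
    // java.lang.UnsupportedOperationException, so it is left commented out.
    // empty.values.reduce(_ + _)

    spark.stop()
  }
}
```

If the RDD may be empty (for example after a filter), sum() or fold(0.0)(_ + _) is the safer choice; reduce is fine when at least one element is guaranteed.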