Here is code I'm trying out for reduceByKey :
import org.apache.spark.rdd.RDD
import org.apache.spark.SparkContext._
import org.apache.spark.SparkContext
import scala.math.random
import org.apache.spark._
import org.apache.spark.storage.StorageLevel
object MapReduce {
def main(args: Array[String]) {
val sc = new SparkContext("local[4]" , "")
val file = sc.textFile("c:/data-files/myfile.txt")
val counts = file.flatMap(line => line.split(" "))
.map(word => (word, 1))
.reduceByKey(_ + _)
}
}
Is giving compiler error : "cannot resolve symbol reduceByKey"
When I hover over implementation of reduceByKey it gives three possible implementations so it appears it is being found ?:
You need to add the following import to your file:
import org.apache.spark.SparkContext._
Spark documentation:
"In Scala, these operations are automatically available on RDDs containing Tuple2 objects (the built-in tuples in the language, created by simply writing (a, b)), as long as you import org.apache.spark.SparkContext._ in your program to enable Spark’s implicit conversions. The key-value pair operations are available in the PairRDDFunctions class, which automatically wraps around an RDD of tuples if you import the conversions."
It seems as if the documented behavior has changed in Spark 1.4.x. To have IntelliJ recognize the implicit conversions you now have to add the following import:
import org.apache.spark.rdd.RDD._
I have noticed that at times IJ is unable to resolve methods that are imported implicitly via PairRDDFunctions https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala .
The methods implicitly imported include the reduceByKey* and reduceByKeyAndWindow* methods. I do not have a general solution at this time -except that yes you can safely ignore the intellisense errors
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With