
spark error RDD type not found when creating RDD

I am trying to create an RDD of case class objects. Eg.,

// sqlContext from the previous example is used in this example.
// createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
import sqlContext.createSchemaRDD

val people: RDD[Person] = ... // An RDD of case class objects, from the previous example.

// The RDD is implicitly converted to a SchemaRDD by createSchemaRDD, allowing it to be stored using Parquet.
people.saveAsParquetFile("people.parquet")

I am trying to fill in the "previous example" part with:

    case class Person(name: String, age: Int)

    // Create an RDD of Person objects and register it as a table.
    val people: RDD[Person] = sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
    people.registerAsTable("people")

I get the following error:

<console>:28: error: not found: type RDD
       val people: RDD[Person] =sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))

Any idea on what went wrong? Thanks in advance!

asked Oct 29 '14 by user1189851

1 Answer

The issue here is the explicit RDD[Person] type annotation. The RDD class isn't imported by default in spark-shell, which is why Scala complains that it can't find the type RDD. Run import org.apache.spark.rdd.RDD first.
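With that import added, the question's snippet should compile. The sketch below assumes a Spark 1.x spark-shell session, where sc and sqlContext are predefined and SchemaRDD/registerAsTable still exist (they were replaced by DataFrame/registerTempTable in later versions); the file path and Person fields are taken from the question.

```scala
// Run inside spark-shell (Spark 1.x).
// This import brings the RDD type into scope for the annotation.
import org.apache.spark.rdd.RDD
// Implicitly converts RDD[Person] to a SchemaRDD.
import sqlContext.createSchemaRDD

case class Person(name: String, age: Int)

// Each line of people.txt is expected to look like "name,age".
val people: RDD[Person] = sc.textFile("/user/root/people.txt")
  .map(_.split(","))
  .map(p => Person(p(0), p(1).trim.toInt))

people.registerAsTable("people")
```

Alternatively, dropping the type annotation entirely (val people = ...) also avoids the error, since Scala then infers the type without needing RDD in scope.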

answered Oct 09 '22 by Josh Rosen