I am trying to create an RDD of case class objects, e.g.:
// sqlContext from the previous example is used in this example.
// createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
import sqlContext.createSchemaRDD
val people: RDD[Person] = ... // An RDD of case class objects, from the previous example.
// The RDD is implicitly converted to a SchemaRDD by createSchemaRDD, allowing it to be stored using Parquet.
people.saveAsParquetFile("people.parquet")
I am trying to complete the "previous example" part by adding:
case class Person(name: String, age: Int)
// Create an RDD of Person objects and register it as a table.
val people: RDD[Person] = sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
people.registerAsTable("people")
I get the following error:
<console>:28: error: not found: type RDD
val people: RDD[Person] =sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
Any idea on what went wrong? Thanks in advance!
The issue here is the explicit RDD[Person] type annotation. The RDD type isn't imported by default in spark-shell, which is why Scala is complaining that it can't find the RDD type. Try running import org.apache.spark.rdd.RDD first.
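For reference, here is a minimal sketch of the whole session with the import added. It assumes the Spark 1.0.x-era API from your snippet (SchemaRDD, registerAsTable, saveAsParquetFile) and the /user/root/people.txt path from your example, and it creates the sqlContext that the implicit conversion needs:

// Make the RDD type visible in the shell.
import org.apache.spark.rdd.RDD

// Create a SQLContext and bring in the implicit RDD -> SchemaRDD conversion.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD

case class Person(name: String, age: Int)

// Parse each "name,age" line into a Person.
val people: RDD[Person] = sc.textFile("/user/root/people.txt")
  .map(_.split(","))
  .map(p => Person(p(0), p(1).trim.toInt))

people.registerAsTable("people")
people.saveAsParquetFile("people.parquet")

With the import in place, the type annotation on people resolves, and the rest of the example runs unchanged.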