Spark provides the method saveAsTextFile, which can easily store an RDD[T] to disk or HDFS. T is an arbitrary serializable class. I want to reverse the operation. Is there a loadFromTextFile that can just as easily load a file into an RDD[T]?
Let me make it clear:
class A extends Serializable {
...
}
val path:String = "hdfs..."
val d1:RDD[A] = create_A
d1.saveAsTextFile(path)
val d2:RDD[A] = a_load_function(path) // this is the function I want
//d2 should be the same as d1
Use d1.saveAsObjectFile(path) to store the RDD and val d2 = sc.objectFile[A](path) to load it.
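A minimal sketch of that round trip, assuming a compiled application with a local master. The case class A(id, name), the object ObjectFileRoundTrip, and the helper roundTrip are hypothetical stand-ins for the question's class and setup; only saveAsObjectFile and objectFile come from the answer above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical stand-in for the question's class A (case classes are Serializable).
case class A(id: Int, name: String)

object ObjectFileRoundTrip {
  // Saves the records with Java serialization, then reads them back as RDD[A].
  // `path` is an output directory that must not exist yet.
  def roundTrip(sc: SparkContext, path: String, data: Seq[A]): Seq[A] = {
    val d1: RDD[A] = sc.parallelize(data)
    d1.saveAsObjectFile(path)
    val d2: RDD[A] = sc.objectFile[A](path)
    d2.collect().toSeq
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for illustration only.
    val conf = new SparkConf().setAppName("object-file-round-trip").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val loaded = roundTrip(sc, args(0), Seq(A(1, "a"), A(2, "b")))
      loaded.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The files written this way are binary, not human-readable, and the order of the loaded elements is not guaranteed to match the original.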
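I don't think you can saveAsTextFile and read the result back as an RDD[A] directly. saveAsTextFile writes the toString of each element as one line, and sc.textFile returns an RDD[String], so you have to transform each line back into an A yourself.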
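If you do want plain text, one way to do that transformation is sketched below. Everything about the format is an assumption: the fields id and name, the tab-separated toString, and the names A.parse, TextFileIO, save, and loadFromTextFile are all hypothetical. It also assumes name never contains a newline.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical stand-in for the question's class A.
case class A(id: Int, name: String) {
  // saveAsTextFile writes toString, so make it a parseable, single-line format.
  override def toString: String = s"$id\t$name"
}

object A {
  // Inverse of toString: split on the first tab only, so `name` may contain tabs.
  def parse(line: String): A = {
    val Array(id, name) = line.split("\t", 2)
    A(id.toInt, name)
  }
}

object TextFileIO {
  // Writes one line per element using A.toString.
  def save(rdd: RDD[A], path: String): Unit =
    rdd.saveAsTextFile(path)

  // Reads the lines back as RDD[String] and parses each one into an A.
  def loadFromTextFile(sc: SparkContext, path: String): RDD[A] =
    sc.textFile(path).map(A.parse)
}
```

With this in place, TextFileIO.loadFromTextFile(sc, path) plays the role of a_load_function from the question, at the cost of maintaining the format by hand.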