
How to convert DataFrame to RDD in Scala?

Can someone please share how to convert a Spark DataFrame to an RDD?

Vajra asked Sep 11 '15 19:09




1 Answer

Simply:

val rows: RDD[Row] = df.rdd 
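
For context, here is a minimal self-contained sketch of the same call. It assumes Spark 2.x or later (SparkSession) running in local mode; the object name, the helper names toRows and toPairs, and the columns "name" and "age" are hypothetical and only there for illustration. On Spark 1.x you would build the DataFrame from a SQLContext instead, but df.rdd is the same.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object DataFrameToRdd {
  // Hypothetical helper: .rdd exposes the DataFrame's contents as untyped Rows.
  def toRows(df: DataFrame): RDD[Row] = df.rdd

  // Hypothetical helper: read typed fields out of each Row by column name.
  // Assumes the DataFrame has a string column "name" and an int column "age".
  def toPairs(df: DataFrame): RDD[(String, Int)] =
    df.rdd.map(row => (row.getAs[String]("name"), row.getAs[Int]("age")))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would get its master from the cluster.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("df-to-rdd")
      .getOrCreate()
    import spark.implicits._

    // Small sample DataFrame (made-up data).
    val df = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")

    val rows: RDD[Row] = toRows(df)
    val pairs: RDD[(String, Int)] = toPairs(df)

    println(rows.count())
    pairs.collect().foreach(println)

    spark.stop()
  }
}
```

Because df.rdd yields RDD[Row], the fields are untyped until you read them with getAs or a similar accessor, which is what toPairs does here.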
Jean Logeart answered Sep 19 '22 19:09