I have an RDD and I want to convert it to a pandas DataFrame. I know that to convert an RDD to a normal (Spark) DataFrame we can do
df = rdd1.toDF()
But I want to convert the RDD to a pandas DataFrame, not a Spark DataFrame. How can I do it?
A Spark RDD can be converted to a DataFrame with toDF(), with createDataFrame(), or by transforming an RDD of Row objects into a DataFrame.
In PySpark, the RDD's toDF() method converts the RDD to a DataFrame.
createDataFrame() takes an RDD and creates a DataFrame from it. It can be called with the RDD alone or with the RDD and a schema. When no column names are supplied, the columns get default sequential names (_1, _2, ...).
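As a rough sketch of the two call styles described above: the helper name rdd_to_spark_df is made up for illustration, and spark is assumed to be an already created SparkSession.

```python
def rdd_to_spark_df(spark, rdd, columns=None):
    """Build a Spark DataFrame from an RDD of tuples.

    `spark` is assumed to be an existing SparkSession and `columns` an
    optional list of column names. Illustrative helper, not a Spark API.
    """
    if columns is None:
        # No schema given: Spark infers the types and names the
        # columns _1, _2, ... in order.
        return spark.createDataFrame(rdd)
    # A plain list of names works as a minimal schema; types are still inferred.
    return spark.createDataFrame(rdd, schema=columns)
```

For example, with rdd = spark.sparkContext.parallelize([("Alice", 2), ("Bob", 5)]), calling rdd_to_spark_df(spark, rdd) should give columns _1 and _2, while rdd_to_spark_df(spark, rdd, ["name", "age"]) names them explicitly.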
To go the other way, from pandas to Spark: import pandas and create a pandas DataFrame with pd.DataFrame(), create a Spark session using SparkSession from pyspark.sql, pass the pandas DataFrame to the session's createDataFrame() method, and print the resulting DataFrame.
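The pandas-to-Spark steps could look roughly like this. The sample data, the app name and the function name are invented for the example, and a local Spark installation is assumed.

```python
# Made-up sample data for the illustration.
PEOPLE = {"name": ["Alice", "Bob"], "age": [2, 5]}


def pandas_to_spark_demo():
    """Create a pandas DataFrame and hand it to Spark."""
    # Imports are kept inside the function so that this file can be loaded
    # even on a machine where Spark is not installed.
    import pandas as pd
    from pyspark.sql import SparkSession

    pdf = pd.DataFrame(PEOPLE)
    spark = (
        SparkSession.builder
        .master("local[1]")          # assumption: run locally on one core
        .appName("pandas-to-spark")  # arbitrary name
        .getOrCreate()
    )
    sdf = spark.createDataFrame(pdf)  # pandas DataFrame -> Spark DataFrame
    sdf.show()                        # print the Spark DataFrame
    return sdf
```

Calling pandas_to_spark_demo() starts a local session, so it needs both pandas and pyspark installed.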
You can use the toPandas() method of a Spark DataFrame:
Returns the contents of this DataFrame as Pandas pandas.DataFrame.
This is only available if Pandas is installed and available.
>>> df.toPandas()  
   age   name
0    2  Alice
1    5    Bob
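Starting from an RDD rather than a DataFrame, the whole path might be sketched as below. The names rdd_to_pandas and demo are hypothetical, and the sample rows simply mirror the output shown above.

```python
def rdd_to_pandas(rdd, columns):
    """Convert an RDD of tuples to a pandas DataFrame via a Spark DataFrame."""
    spark_df = rdd.toDF(columns)  # requires an active SparkSession
    # toPandas() collects every row to the driver, so the data must fit
    # in the driver's memory.
    return spark_df.toPandas()


def demo():
    """Run the conversion on two sample rows (assumes a local Spark install)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("rdd-to-pandas")  # arbitrary name
        .getOrCreate()
    )
    rdd = spark.sparkContext.parallelize([(2, "Alice"), (5, "Bob")])
    pdf = rdd_to_pandas(rdd, ["age", "name"])
    print(pdf)
    spark.stop()
    return pdf
```

Because toPandas() brings all the data to a single machine, this is best suited to RDDs that are already small or have been filtered or aggregated first.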
You'll have to use a Spark DataFrame as an intermediary step between your RDD and the desired Pandas DataFrame.
For example, let's say I have a text file, flights.csv, that has been read into an RDD:
flights = sc.textFile('flights.csv')
You can check the type:
type(flights)
<class 'pyspark.rdd.RDD'>
If you call toPandas() directly on the RDD, it won't work, because an RDD has no such method. Depending on the format of the objects in your RDD, some processing may be necessary to get to a Spark DataFrame first. In this example, the following code does the job:
# RDD to Spark DataFrame
sparkDF = flights.map(lambda x: str(x)).map(lambda w: w.split(',')).toDF()
# Spark DataFrame to Pandas DataFrame
pdsDF = sparkDF.toPandas()
You can check the type:
type(pdsDF)
<class 'pandas.core.frame.DataFrame'>
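One caveat with a plain split(',') is that a header line ends up as a data row and quoted fields containing commas get split. A possible variation is sketched here, assuming flights.csv has a header row (the original does not say so); the function names are invented for the example.

```python
import csv


def parse_csv_line(line):
    """Split one CSV line into fields, keeping quoted commas intact."""
    return next(csv.reader([line]))


def csv_rdd_to_pandas(lines):
    """Turn an RDD of CSV text lines into a pandas DataFrame.

    `lines` is an RDD of strings, e.g. sc.textFile('flights.csv').
    Assumes the first line is a header row and that no field spans
    several lines.
    """
    header = lines.first()
    columns = parse_csv_line(header)
    rows = (
        lines
        # Drop blank lines and any line identical to the header.
        .filter(lambda line: line.strip() != "" and line != header)
        .map(parse_csv_line)
    )
    # Every column comes out as a string; cast afterwards if needed.
    return rows.toDF(columns).toPandas()
```

If the file does not have to pass through an RDD at all, spark.read.csv('flights.csv', header=True, inferSchema=True).toPandas() reaches the same place with less code.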