In a PySpark application, I am trying to transpose a dataframe by converting it to pandas, and then I want to write the result to a CSV file. This is how I am doing it:
df = df.toPandas().set_index("s").transpose()
df.coalesce(1).write.option("header", True).option("delimiter", ",").csv('dataframe')
When executing this script I get the following error:
'DataFrame' object has no attribute 'coalesce'
What is the problem? How can I fix it?
The problem is that toPandas() converts the Spark dataframe into a pandas dataframe, and a pandas dataframe does not have a coalesce method. See the pandas documentation for the methods it does provide.
When you use toPandas(), the data is already collected and held in memory on the driver, so there are no partitions left to coalesce. Use the pandas dataframe method df.to_csv(path) instead.
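A minimal sketch of that fix, assuming the result of toPandas() looks roughly like the small frame built here. The columns x and y, the sample values, and the output path are made up for illustration; only the column "s" comes from the question.

```python
import os
import tempfile

import pandas as pd

# Stand-in for spark_df.toPandas(); "x", "y" and the values are hypothetical.
pdf = pd.DataFrame({"s": ["a", "b"], "x": [1, 2], "y": [3, 4]})

# Same transpose as in the question: values of "s" become the column labels.
transposed = pdf.set_index("s").transpose()

# Write with pandas instead of the Spark writer. sep="," and header=True are
# the pandas defaults, spelled out to mirror the options in the question.
out_path = os.path.join(tempfile.mkdtemp(), "dataframe.csv")
transposed.to_csv(out_path, sep=",", header=True)

# Read the file back to check what was written.
written = pd.read_csv(out_path, index_col=0)
print(written)
```

Note that to_csv writes a single file at the given path, whereas the Spark writer creates a directory of part files. If the Spark writer is really needed, the pandas result can be converted back with spark.createDataFrame(transposed.reset_index()), but for data that already fits in driver memory that is usually unnecessary.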