How can I export Spark's DataFrame to a CSV file using Scala?
In Spark, you can save (write) a DataFrame to a CSV file on disk using dataframeObj.write.csv("path"). With the same API you can also write a DataFrame to AWS S3, Azure Blob Storage, HDFS, or any other file system that Spark supports.
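For example, here is a minimal sketch of that call with a header row; the output paths and bucket name below are placeholders, not part of the original answer:

// Write a DataFrame as CSV with a header row; adjust the paths to your environment.
df.write.option("header", "true").csv("/tmp/output-csv")
df.write.option("header", "true").csv("s3a://my-bucket/output")   // AWS S3, assuming the s3a connector is configured
df.write.option("header", "true").csv("hdfs:///user/me/output")   // HDFS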
For writing the CSV file by hand, we'll use BufferedWriter and FileWriter from java.io together with a CSV writer. We need to import these before choosing an output path and giving our file its column headings. We then take a few rows of our data (the same sample we use as input for the training dataset) and write them out to the CSV file; a sketch of this approach is shown below.
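Here is a rough sketch of that manual approach, assuming a DataFrame named df is already in scope; the output path, row count, and the helper name writeSampleCsv are placeholders, and a dedicated CSV writer (for example the scala-csv library's CSVWriter) would be needed for proper quoting and escaping, which this bare version skips:

import java.io.{BufferedWriter, FileWriter}
import org.apache.spark.sql.DataFrame

// Collect a small sample of rows on the driver and write them out as CSV by hand.
// The path and row count passed in are placeholder values.
def writeSampleCsv(df: DataFrame, path: String, rows: Int): Unit = {
  val writer = new BufferedWriter(new FileWriter(path))
  try {
    writer.write(df.columns.mkString(","))    // column headings
    writer.newLine()
    df.limit(rows).collect().foreach { row =>
      writer.write(row.toSeq.mkString(","))   // naive join; no quoting/escaping
      writer.newLine()
    }
  } finally {
    writer.close()
  }
}

writeSampleCsv(df, "/tmp/sample.csv", 10)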
In Spark versions 2+ you can simply use the following:
df.write.csv("/your/location/data.csv")
If you want the output written as a single file rather than split across partitions, add a .coalesce(1) as follows:
df.coalesce(1).write.csv("/your/location/data.csv")
Easiest and best way to do this is to use spark-csv
library. You can check the documentation in the provided link and here
is the scala example of how to load and save data from/to DataFrame.
Code (Spark 1.4+):
dataFrame.write.format("com.databricks.spark.csv").save("myFile.csv")
Edit:
Spark creates part-files while saving the CSV data. If you want to merge the part-files into a single CSV, refer to the following:
Merge Spark's CSV output folder to Single File
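One way to do that merge, sketched here under the assumption of a Hadoop 2.x environment where FileUtil.copyMerge is still available (the source and destination paths and the helper name are placeholders):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

// Merge all part-files under srcDir into a single dstFile on the same file system.
// Assumes Hadoop 2.x; copyMerge was removed in Hadoop 3. Paths are placeholders.
def mergePartFiles(srcDir: String, dstFile: String): Unit = {
  val conf = new Configuration()
  val fs = FileSystem.get(conf)
  FileUtil.copyMerge(fs, new Path(srcDir), fs, new Path(dstFile), false, conf, null)
}

mergePartFiles("/your/location/mydata", "/your/location/mydata.csv")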
The above solution exports the CSV as multiple partitions. I found another solution by zero323 on this Stack Overflow page that exports a DataFrame into one single CSV file when you use coalesce.
df.coalesce(1)
.write.format("com.databricks.spark.csv")
.option("header", "true")
.save("/your/location/mydata")
This would create a directory named mydata where you'll find a csv file that contains the results.