
How to export DataFrame to csv in Scala?

How can I export Spark's DataFrame to csv file using Scala?

Tong asked Sep 11 '15


People also ask

How do I convert a DataFrame to CSV in Scala?

The easiest and best way to do this is to use the spark-csv library. You can check the documentation in the provided link, which includes a Scala example of how to load and save data from/to a DataFrame.

How do I save a DataFrame as CSV in Spark Scala?

In Spark, you can save (write) a DataFrame to a CSV file on disk with dataframeObj.write.csv("path"). Using this, you can also write a DataFrame to AWS S3, Azure Blob Storage, HDFS, or any other Spark-supported file system.

How do I create a CSV file in Scala?

For writing a CSV file in plain Scala, use BufferedWriter and FileWriter. Import these classes before choosing a path and giving column headings to the file, then write the header line followed by one comma-separated line per data row.
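As a minimal sketch of the plain-Scala approach just described (no Spark involved) — the file name, column names, and rows below are illustrative:

```scala
import java.io.{BufferedWriter, FileWriter}

object CsvWriterExample {
  def main(args: Array[String]): Unit = {
    val header = Seq("id", "name", "score")
    val rows = Seq(
      Seq("1", "alice", "0.9"),
      Seq("2", "bob", "0.7")
    )
    val writer = new BufferedWriter(new FileWriter("training.csv"))
    try {
      // Write the header, then one comma-separated line per row.
      (header +: rows).foreach { row =>
        writer.write(row.mkString(","))
        writer.newLine()
      }
    } finally {
      writer.close()
    }
  }
}
```

Note this does no quoting or escaping; for fields that may contain commas or newlines, a dedicated CSV library is safer.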


3 Answers

In Spark versions 2+ you can simply use the following:

df.write.csv("/your/location/data.csv")

If you want to make sure that the output is no longer split across multiple part-files, add a .coalesce(1) as follows:

df.coalesce(1).write.csv("/your/location/data.csv")
Taylrl answered Oct 24 '22


The easiest and best way to do this is to use the spark-csv library. You can check the documentation in the provided link, and here is a Scala example of how to save data from a DataFrame.

Code (Spark 1.4+):

dataFrame.write.format("com.databricks.spark.csv").save("myFile.csv")
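The same format string works for loading data back into a DataFrame. A hedged sketch for Spark 1.4+, assuming sqlContext is the SQLContext already in scope (as it is in a spark-shell session):

```scala
// Reading CSV via spark-csv; `sqlContext` is assumed to be in scope.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")       // treat the first line as column names
  .option("inferSchema", "true")  // infer column types from the data
  .load("myFile.csv")
```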

Edit:

Spark creates part-files while saving the CSV data. If you want to merge the part-files into a single CSV, refer to the following:

Merge Spark's CSV output folder to Single File
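One common way to do that merge, sketched here under the assumption of a Hadoop 2.x classpath (FileUtil.copyMerge was removed in Hadoop 3), with illustrative paths:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

// Merge the part-files Spark wrote under `mydata` into one file.
val conf = new Configuration()
val fs = FileSystem.get(conf)
FileUtil.copyMerge(
  fs, new Path("/your/location/mydata"),     // source directory of part-files
  fs, new Path("/your/location/mydata.csv"), // destination single file
  false,                                     // keep the source directory
  conf, null)
```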

karthik manchala answered Oct 24 '22


The above solution exports the CSV as multiple partitions. I found another solution by zero323 on this Stack Overflow page that exports a DataFrame into one single CSV file when you use coalesce:

df.coalesce(1)
  .write.format("com.databricks.spark.csv")
  .option("header", "true")
  .save("/your/location/mydata")

This creates a directory named mydata, inside which you'll find a CSV file containing the results.

Abu Shoeb answered Oct 24 '22