dataframe.saveAsTextFile saves only the data in a delimited format. How do I save the DataFrame with headers in Java?
sourceRufFrame.toJavaRDD().map(new TildaDelimiter()).coalesce(1, true).saveAsTextFile(targetSrcFilePath);
To write a DataFrame to CSV with a header, set the header option through option(); the Spark CSV data source provides several other options as well. Spark writes one part file per partition, so a DataFrame with 3 partitions produces 3 part files when it is saved to the file system.
DataFrames can also be saved as persistent tables in the Hive metastore using the saveAsTable command. Note that an existing Hive deployment is not necessary to use this feature: Spark will create a default local Hive metastore (using Derby) for you.
If you want to save as a CSV file, I would suggest using the spark-csv package. With it you can save your DataFrame with a header as below:
dataFrame.write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("delimiter", <your delimiter>)
  .save(output)
For further information, see: https://github.com/databricks/spark-csv
spark-csv is available as a Maven dependency.
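With Spark 2.x, CSV writing is built in, so the spark-csv package is no longer needed:
df.write.option("header", "true").csv("path")
In Java, write is a method call: df.write().option("header", "true").csv("path")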
Cheers
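If you stay with the saveAsTextFile approach from the question instead, nothing writes the header for you, so it has to be produced by hand as one more delimited line placed before the data. Here is a minimal plain-Java sketch of that step, with no Spark dependency. The class name, method names, column names and sample rows are all hypothetical, and the tilde delimiter is only assumed from the name TildaDelimiter in the question. In Spark, the column names would come from the DataFrame's columns() method.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a header line and puts it in front of delimited rows.
class HeaderLineSketch {

    // Joins the column names with the same delimiter that the data rows use.
    static String headerLine(String[] columnNames, String delimiter) {
        return String.join(delimiter, columnNames);
    }

    // Returns a new list: the header first, then the already-delimited rows in order.
    static List<String> withHeader(String[] columnNames, String delimiter, List<String> rows) {
        List<String> lines = new ArrayList<>(rows.size() + 1);
        lines.add(headerLine(columnNames, delimiter));
        lines.addAll(rows);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String[] columns = {"id", "name"};                      // assumed column names
        List<String> rows = Arrays.asList("1~alice", "2~bob");  // assumed, already delimited

        // Write header plus rows to a single temporary text file, then read it back.
        Path out = Files.createTempFile("with-header", ".txt");
        Files.write(out, withHeader(columns, "~", rows), StandardCharsets.UTF_8);
        System.out.println(Files.readAllLines(out, StandardCharsets.UTF_8));
    }
}
```

This only illustrates the shape of the output. With real data the option("header", "true") route above is simpler, because Spark then writes the header itself.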