
Spark SQL - How to write DataFrame to text file?

I am using Spark SQL for reading and writing Parquet files.

But in some cases, I need to write the DataFrame as a text file instead of JSON or Parquet.

Is there a built-in method for this, or do I have to convert the DataFrame to an RDD and use the saveAsTextFile() method?

asked Mar 15 '16 by Shankar

2 Answers

df.repartition(1).write.option("header", "true").csv("filename.csv")
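For plain text output specifically (as the question asks), Spark 1.6+ also provides DataFrameWriter.text, which requires the DataFrame to have exactly one string column. A minimal Scala sketch, assuming an existing DataFrame df and an output path of your choosing:

```scala
import org.apache.spark.sql.functions.concat_ws

// text() accepts only a single string column, so collapse all
// columns into one comma-joined string first. The column name
// "value" and the output path are illustrative.
df.select(concat_ws(",", df.columns.map(df(_)): _*).as("value"))
  .write
  .text("output_dir")
```

Each row then becomes one line in the output files, much like saveAsTextFile() on an RDD, but without leaving the DataFrame API.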
answered Oct 07 '22 by Igorock

Using Databricks spark-csv you can save directly to a CSV file and load it back from a CSV file afterwards, like this:

// Java: note the semicolon on the import
import org.apache.spark.sql.SQLContext;

SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
    .format("com.databricks.spark.csv")
    .option("inferSchema", "true")
    .option("header", "true")
    .load("cars.csv");

df.select("year", "model").write()
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .option("codec", "org.apache.hadoop.io.compress.GzipCodec")
    .save("newcars.csv");
answered Oct 07 '22 by Radu Ionescu