Saving dataframe records in a tab delimited file

How can I save the records of a DataFrame to a tab-delimited output file? The DataFrame looks like this:

>>> csvDf.show(2,False)

|1  |Eldon Base for stackable storage shelf, platinum  |Muhammed MacIntyre|3  |-213.25|38.94 |35   |Nunavut|Storage & Organization|0.8 |
|2  |1.7 Cubic Foot Compact "Cube" Office Refrigerators|Barry French      |293|457.81 |208.16|68.02|Nunavut|Appliances            |0.58|
Suraj asked Dec 12 '17 19:12


3 Answers

Just pass the delimiter option to the writer:

csvDf.write.option("delimiter", "\t").csv(output_path)

In Spark 1.6, use the spark-csv package (check its README for detailed instructions) with the same option:

csvDf.write.option("delimiter", "\t").format("com.databricks.spark.csv").save(output_path)
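
Note that both writers produce a directory at output_path containing one part-* file per partition, not a single file. As a rough sketch for checking the result on a local filesystem, the part files can be read back with Python's standard csv module. The helper name read_tsv_parts is hypothetical, and the sketch assumes uncompressed output, no header row, and Spark's default quote (") and escape (\) characters:

```python
import csv
import glob
import os


def read_tsv_parts(output_dir):
    """Read every part-* file in a Spark CSV output directory into a list of rows."""
    rows = []
    # Data files are named part-00000-..., so sorting keeps partition order.
    # Marker files such as _SUCCESS and hidden .crc files do not match the pattern.
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.reader(
                f,
                delimiter="\t",
                quotechar='"',
                escapechar="\\",    # assumption: Spark's default escape character
                doublequote=False,  # quotes are escaped with a backslash, not doubled
            )
            rows.extend(reader)
    return rows
```

Each returned row is a list of strings, so a field such as 1.7 Cubic Foot Compact "Cube" Office Refrigerators should come back with its embedded quotes intact.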
Alper t. Turker answered Nov 15 '22 06:11


In Spark 2.4.3 it is:

csvDf
.write
.option("sep", "\t")
.option("encoding", "UTF-8")
.csv(targetFilePath)
dripp answered Nov 15 '22 05:11


This worked for me:

csvDf.rdd.map(lambda x: '\t'.join(x)).coalesce(1).saveAsTextFile('/output/csv/6.csv')
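
One caveat with this approach: '\t'.join(x) only works when every field in the row is already a string, and it raises a TypeError on numbers or nulls. A minimal sketch of a more forgiving row formatter follows. The name row_to_tsv and its null_value parameter are hypothetical, and replacing embedded tabs and newlines with spaces is just one possible policy:

```python
def row_to_tsv(row, null_value=""):
    """Join the fields of one row with tabs, converting non-string values first."""
    fields = []
    for value in row:
        # None (a SQL NULL) becomes null_value instead of the literal text "None".
        text = null_value if value is None else str(value)
        # Tabs or newlines inside a field would break the layout, so replace them with spaces.
        fields.append(text.replace("\t", " ").replace("\n", " "))
    return "\t".join(fields)


# A pyspark.sql.Row is a tuple subclass, so a plain tuple stands in for one here.
print(row_to_tsv((2, "Barry French", 457.81, None)))
```

It could be passed in place of the lambda, as in csvDf.rdd.map(row_to_tsv).coalesce(1).saveAsTextFile(...). Keep in mind that saveAsTextFile also creates a directory at the given path, with the data in a part file inside it.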

Suraj answered Nov 15 '22 04:11