Spark: saveAsTextFile() creates only a _SUCCESS file and no part file when writing to the local filesystem

I am writing an RDD to a file using the command below:

rdd.coalesce(1).saveAsTextFile(FilePath)

When FilePath is an HDFS path (hdfs://node:9000/folder/), everything works fine.

When FilePath is a local path (file:///home/user/folder/), everything seems to work: the output folder is created and the _SUCCESS file is present.

However, I do not see any part-00000 file containing the output. There is no other file in the folder, and there is no error in the Spark console output either.

I also tried calling collect() on the RDD before calling saveAsTextFile(), and giving 777 permissions to the output folder, but neither helped.

Please help.

Nikhil Utane asked Jun 14 '17 05:06



1 Answer

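Saving to a local path (file:///...) only has the expected effect when Spark runs with a local master. On a cluster, each executor writes its part files to the local filesystem of the node it runs on, so the folder on the driver machine typically ends up with only the _SUCCESS file, while part-00000 sits on whichever worker ran the task.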
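If the job has to run on a cluster and the result is small enough to pass through the driver, one possible workaround is to pull the rows back to the driver and write them with ordinary file I/O. The sketch below is only an illustration under that assumption: write_lines_locally and the file name output.txt are hypothetical names, and the helper takes any iterable of rows so it does not depend on Spark itself.

```python
import os
from typing import Iterable


def write_lines_locally(lines: Iterable[str], path: str) -> int:
    """Write each element as one line to a file on the local disk.

    Hypothetical helper: it runs wherever it is called, so calling it
    in the driver program writes to the driver's filesystem.
    Returns the number of lines written.
    """
    # Create the parent folder if it does not exist yet.
    parent = os.path.dirname(os.path.abspath(path))
    os.makedirs(parent, exist_ok=True)

    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for line in lines:
            # Like saveAsTextFile(), store the string form of each element.
            out.write(f"{line}\n")
            count += 1
    return count


# Assumed usage with PySpark, where `rdd` is the RDD from the question:
#   write_lines_locally(rdd.toLocalIterator(), "/home/user/folder/output.txt")
```

toLocalIterator() fetches one partition at a time, so the driver needs enough memory for the largest partition rather than for the whole RDD. For large outputs, writing to HDFS and copying the files afterwards is likely the safer route.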

dpwang answered Oct 16 '22 15:10