The application includes
val stats = sqlContext.sql("select id, n from myTable")
stats.write.parquet("myTable.parquet")
This creates the directory myTable.parquet containing nothing but an empty _SUCCESS file, even though the DataFrame is not empty:
stats.show // shown here for illustration only; the original data's size is what motivates using Parquet
+-----+----+
| id | n |
+-----+----+
| a | 1 |
| b | 2 |
+-----+----+
stats.printSchema
root
|-- id: string (nullable = true)
|-- n: long (nullable = true)
How can I make write.parquet write the actual contents of the DataFrame? What is missing?
Note: this also occurs with saveAsTextFile.
In my case, this happened when I was trying to save the file to my local filesystem instead of a filesystem accessible from the Spark cluster.
The file is written by the Spark worker nodes, not by the PySpark client, so it should be written to a filesystem that is accessible to both the worker nodes and the client.
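For illustration, here is a minimal sketch of the fix in Scala, assuming the same sqlContext and table as in the question; the HDFS namenode host/port, output paths, and S3 bucket name are hypothetical placeholders, not values from the original setup:

// As in the question:
val stats = sqlContext.sql("select id, n from myTable")

// Writing to a worker-local path: each executor writes its part files to its
// own local disk, so the directory seen by the client ends up with only _SUCCESS.
// stats.write.parquet("file:///tmp/myTable.parquet")

// Writing to a shared filesystem (HDFS here) that all workers and the client
// can reach; namenode:8020 and the path are placeholders.
stats.write.parquet("hdfs://namenode:8020/user/me/myTable.parquet")

// Or to S3, if the s3a connector and credentials are configured;
// the bucket name is a placeholder.
// stats.write.parquet("s3a://my-bucket/myTable.parquet")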