
How to access local files in Spark on Windows?

I am using Spark on Windows. I know that on *nix, a local file is accessed like this:

val textFile = sc.textFile("file:///usr/local/spark/README.md") 

But how can I access a local file on Windows? I have tried the following:

val logFile = "C:\spark-1.3.1-bin-hadoop2.4\README.md"
val logFile = "file\\C:\spark-1.3.1-bin-hadoop2.4\README.md"

But neither of them works.

Nan Xiao asked May 29 '15 02:05



People also ask

How do I access local files in Spark?

For a file added with SparkContext.addFile, use SparkFiles.get(fileName) in Spark jobs to find its download location. A directory can be given if the recursive option is set to true. Currently, directories are only supported for Hadoop-supported filesystems.

Can Spark read from local file system?

Spark can create distributed datasets from any storage source supported by Hadoop, including your local file system, HDFS, Cassandra, HBase, Amazon S3, etc. Spark supports text files, SequenceFiles, and any other Hadoop InputFormat.


1 Answer

Unfortunately, on Windows you have to escape each backslash ("\") in the path string.

Try:

"C:\\spark-1.3.1-bin-hadoop2.4\\README.md"
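
For comparison, here is a minimal sketch of a few ways to write the same Windows path as a Scala string. The object name WindowsPaths is made up for illustration. The triple-quoted and file:/// forms are not part of the answer above; they are commonly used alternatives and are assumed, not verified here, to behave the same with sc.textFile.

```scala
object WindowsPaths {
  // Escaped backslashes: each "\\" in the literal is one backslash in the string.
  val escaped: String = "C:\\spark-1.3.1-bin-hadoop2.4\\README.md"

  // Triple-quoted literal: sequences such as \s and \R are left untouched,
  // so the path can be typed exactly as it appears in Explorer.
  val raw: String = """C:\spark-1.3.1-bin-hadoop2.4\README.md"""

  // Forward slashes with an explicit file scheme, mirroring the *nix form
  // (assumption: not taken from the answer above).
  val uri: String = "file:///C:/spark-1.3.1-bin-hadoop2.4/README.md"

  def main(args: Array[String]): Unit = {
    println(escaped)
    println(raw)
    println(uri)
    // In spark-shell, where `sc` is predefined, one of these would be passed as:
    //   val textFile = sc.textFile(escaped)
  }
}
```

The first two values are the same string; only the way the literal is written differs.
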
ayan guha answered Sep 21 '22 16:09
