
How to get path to the uploaded file

I am running a Spark cluster on Google Cloud and I upload a configuration file with each job. What is the path to a file that is uploaded with the submit command?

In the example below how can I read the file Configuration.properties before the SparkContext has been initialized? I am using Scala.

 gcloud dataproc jobs submit spark --cluster my-cluster --class MyJob --files config/Configuration.properties --jars my.jar
orestis asked Jan 16 '17 13:01



1 Answer

The local path to a file distributed through the SparkFiles mechanism (the --files argument or the SparkContext.addFile method) can be obtained using SparkFiles.get:

org.apache.spark.SparkFiles.get(fileName)

You can also get the path to the root directory using SparkFiles.getRootDirectory:

org.apache.spark.SparkFiles.getRootDirectory

You can use these combined with standard IO utilities to read the files.
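As a minimal sketch of that combination, the helper below resolves a distributed file with SparkFiles.get and loads it with java.util.Properties. The object name DistributedConfig and its load method are hypothetical, introduced only for illustration. It assumes the SparkContext already exists and that the file was actually copied to the node where the call runs, which can depend on the cluster manager and deploy mode.

```scala
import java.io.FileInputStream
import java.util.Properties

import org.apache.spark.SparkFiles

// Hypothetical helper, not part of Spark.
object DistributedConfig {
  // Call only after the SparkContext has been created.
  // fileName is the bare file name (e.g. "Configuration.properties"),
  // without the directory prefix used on the submit command line.
  def load(fileName: String): Properties = {
    // Resolve the local path of the distributed file.
    val path = SparkFiles.get(fileName)
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in) finally in.close()
    props
  }
}
```

Inside the job this might be called as DistributedConfig.load("Configuration.properties") right after the context is constructed.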

how can I read the file Configuration.properties before the SparkContext has been initialized?

SparkFiles are distributed by the driver and cannot be accessed before the context has been initialized. To be distributed in the first place, they have to be accessible from the driver node. So this part of the question depends solely on what type of storage you use to expose the file to the driver node.
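If the file is already readable on the driver node, it can be loaded with plain JVM IO before any SparkContext is created. The sketch below assumes exactly that: ConfigLoader is a hypothetical helper, and the path is assumed to be supplied by the caller (for example as a program argument), since how the file reaches the driver depends on your storage setup.

```scala
import java.io.FileInputStream
import java.util.Properties

// Hypothetical helper; it uses no Spark classes, so it works before the context exists.
object ConfigLoader {
  // path must point to a file readable from the driver node (an assumption).
  def load(path: String): Properties = {
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in) finally in.close()
    props
  }
}
```

A job could then call ConfigLoader.load(args(0)) at the top of main and use the resulting values when building its SparkConf.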

zero323 answered Oct 11 '22 12:10