I'm trying to use spark-submit to execute my Python code on a Spark cluster. Generally we run spark-submit with Python code like below.
# Run a Python application on a cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  my_python_code.py \
  1000
But I want to run my_python_code.py by passing several arguments. Is there a smart way to pass arguments?
Once you do a spark-submit, a driver program is launched; it requests resources from the cluster manager, and at the same time the main program of your application is started by the driver program.
Making this more systematic: put the code below in a script (e.g. spark-script.sh), and then you can simply run ./spark-script.sh your_file.scala first_arg second_arg third_arg and have an Array[String] called args with your arguments.
Spark translates the RDD transformations into a DAG (Directed Acyclic Graph) and starts the execution. At a high level, when any action is called on the RDD, Spark creates the DAG and submits it to the DAG scheduler.
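For example, a minimal PySpark sketch (the app name and the specific transformations here are just illustrative) of how nothing runs until an action is called:

from pyspark import SparkContext

sc = SparkContext(appName="dag-example")

# Transformations are lazy: Spark only records the lineage at this point.
rdd = sc.parallelize(range(1000)) \
        .map(lambda x: x * 2) \
        .filter(lambda x: x % 3 == 0)

# The action triggers DAG creation and submission to the DAG scheduler.
print(rdd.count())

sc.stop()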
The Apache Spark binary comes with a spark-submit.sh script file for Linux and Mac, and a spark-submit.cmd command file for Windows. These scripts live in the $SPARK_HOME/bin directory and are used to submit a PySpark file with the .py extension (Spark with Python) to the cluster.
Even though sys.argv is a good solution, I still prefer this more proper way of handling command-line args in my PySpark jobs:
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--ngrams", help="some useful description.")
args = parser.parse_args()
if args.ngrams:
    ngrams = args.ngrams
This way, you can launch your job as follows:
spark-submit job.py --ngrams 3
More information about the argparse module can be found in the Argparse Tutorial.
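To make the wiring concrete, here is a hedged sketch of what a complete job.py could look like (the SparkSession usage and the toy n-gram logic are illustrative assumptions, not part of the answer above):

import argparse

from pyspark.sql import SparkSession

parser = argparse.ArgumentParser()
parser.add_argument("--ngrams", type=int, default=2, help="size of the n-grams to build")
args = parser.parse_args()
n = args.ngrams

spark = SparkSession.builder.appName("ngrams-job").getOrCreate()

# Use the parsed argument inside the job.
lines = spark.sparkContext.parallelize(["to be or not to be"])
ngrams = lines.flatMap(lambda line: [
    tuple(line.split()[i:i + n])
    for i in range(len(line.split()) - n + 1)
])
print(ngrams.collect())

spark.stop()

It is launched exactly as shown above: spark-submit job.py --ngrams 3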
Yes: Put this in a file called args.py
import sys
print(sys.argv)
If you run
spark-submit args.py a b c d e
You will see:
['/spark/args.py', 'a', 'b', 'c', 'd', 'e']
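Building on that, a minimal sketch (the my_job.py name and the numeric argument are hypothetical, not from the answers above) of reading a positional argument inside a PySpark job:

import sys

from pyspark.sql import SparkSession

# Submitted as: spark-submit my_job.py 1000
n = int(sys.argv[1]) if len(sys.argv) > 1 else 100

spark = SparkSession.builder.appName("argv-example").getOrCreate()
print(spark.sparkContext.parallelize(range(n)).sum())
spark.stop()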