Is it possible to submit and run Spark jobs concurrently on the same AWS EMR cluster? If yes, could you please elaborate?
You should use the flag --deploy-mode cluster
which allows you to submit multiple jobs to your cluster. YARN will then handle resource allocation and queuing for you.
The full example:
# --deploy-mode can be client for client mode
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000
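For instance, to run two jobs at the same time you could SSH into the master node and launch each spark-submit in the background; YARN queues and schedules both. This is just a sketch — the class names and jar path below are placeholders, not real artifacts:

```shell
# Submit two Spark jobs concurrently; YARN handles scheduling for both.
# (com.example.JobA/JobB and /path/to/app.jar are hypothetical placeholders)
spark-submit --class com.example.JobA --master yarn --deploy-mode cluster /path/to/app.jar &
spark-submit --class com.example.JobB --master yarn --deploy-mode cluster /path/to/app.jar &
wait  # block until both background submissions finish
```

In cluster mode each spark-submit returns once the application is handed off to YARN, so the driver for each job runs on the cluster rather than on the machine you submitted from.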
More details in the Spark documentation on submitting applications.
Currently, EMR itself doesn't support running multiple steps in parallel. As far as I know, such a feature has been implemented experimentally but not yet released due to some issues.
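Since EMR steps run sequentially, jobs submitted directly to YARN as above are the way to get concurrency. You can confirm that several applications are actually running side by side with YARN's CLI on the master node:

```shell
# List Spark applications currently running or queued on the cluster
yarn application -list -appStates RUNNING,ACCEPTED
```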