 

Spark Standalone Mode multiple shell sessions (applications)

Tags:

apache-spark

In Spark 1.0.0 Standalone mode with multiple worker nodes, I'm trying to run a Spark shell from two different computers (same Linux user).

In the documentation, it says "By default, applications submitted to the standalone mode cluster will run in FIFO (first-in-first-out) order, and each application will try to use all available nodes."

Each application is capped at 4 cores (via SPARK_JAVA_OPTS="-Dspark.cores.max=4"), with 8 cores available per worker. Memory is also limited so that enough should be available for both shells.
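(For reference, a conf/spark-env.sh along these lines reproduces this setup; the exact figures are illustrative rather than the real ones:)

# conf/spark-env.sh on each worker -- values are illustrative
SPARK_WORKER_CORES=8
SPARK_WORKER_MEMORY=10g
SPARK_JAVA_OPTS="-Dspark.cores.max=4"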

However, the Spark Master WebUI shows that the shell application started second always remains in the "WAITING" state until the first one exits. It is assigned 0 cores and 10G of memory per node (the same as the application that is already running).

Is there a way to have both shells running at the same time without using Mesos?

asked Jun 23 '14 by helm


2 Answers

Before a shell starts processing on a Spark standalone cluster, there have to be sufficient cores and memory available. You must specify the number of cores you want for each Spark shell, or it will use them all. If you start the first shell with 5 cores and an executor memory of 10G (the amount of memory you allocated for the executors), and the second shell with 2 cores and 10G of memory, the second one will still not start, because the first shell is holding both executors and is using all of the memory on both. If you instead specify 5G of executor memory for each shell, they can run concurrently.
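As a minimal sketch of that last point (the master host variable and the 10G-per-worker sizing are assumptions, not details from the question), each user would launch their shell with an explicit core and memory cap:

# workers offer 10G each, so give every shell half of it plus a core cap
bin/spark-shell --master spark://$MASTER:7077 --total-executor-cores 4 --executor-memory 5g

# a second shell launched the same way gets resources immediately instead of WAITING
bin/spark-shell --master spark://$MASTER:7077 --total-executor-cores 4 --executor-memory 5g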

Essentially you want to have multiple applications running on a standalone cluster; unfortunately, it is really not designed to handle this case well. If you want to do that, you should use either Mesos or YARN.

answered Oct 13 '22 by David


One workaround is to restrict the number of cores per Spark shell using --total-executor-cores. For example, to restrict a shell to 16 cores, launch it like this:

bin/spark-shell --total-executor-cores 16 --master spark://$MASTER:7077

In this case each shell will use only 16 cores, so two shells can run on a 32-core cluster. They run simultaneously but never use more than 16 cores each :(
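If memory rather than cores is what keeps the second shell in WAITING, the same flag can be paired with an executor-memory cap (the 5g figure below is only an illustration, not a value from the question):

# split memory as well as cores so neither resource blocks the second shell
bin/spark-shell --total-executor-cores 16 --executor-memory 5g --master spark://$MASTER:7077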

This solution is far from ideal, I know: you depend on users to restrict themselves and to shut down their shells, and resources are wasted whenever a user is not running code. I have created a JIRA request to fix this, which you can vote for.

answered Oct 13 '22 by Tobber