I want to launch some pyspark jobs on YARN. I have 2 nodes, with 10 GB each. I am able to open up the pyspark shell like so: pyspark
Now I have a very simple example that I try to launch:
import random

NUM_SAMPLES = 1000

def inside(p):
    # pick a random point in the unit square; test if it falls inside the unit circle
    x, y = random.random(), random.random()
    return x*x + y*y < 1

# Monte Carlo estimate of Pi (run in the pyspark shell, where sc already exists)
count = sc.parallelize(xrange(0, NUM_SAMPLES)) \
    .filter(inside).count()
print "Pi is roughly %f" % (4.0 * count / NUM_SAMPLES)
As a result I get a very long Spark log with the error output. The most important lines are:
ERROR cluster.YarnScheduler: Lost executor 1 on <ip>: Container marked as failed: <containerID> on host: <ip>. Exit status 1. Diagnostics: Exception from container-launch. ......
Later on in the logs I see...
ERROR scheduler.TaskSetManager: Task 0 in stage 0.0 failed 1 times: aborting job
INFO cluster.YarnClientSchedulerBackend: Asked to remove non-existent executor 1
INFO spark.ExecutorAllocationManager: Existing executor 1 has been removed (new total is 0)
From what I gather from the logs above, this seems to be a container sizing issue in YARN.
My yarn-site.xml file has the following settings:
yarn.scheduler.maximum-allocation-mb = 10240
yarn.nodemanager.resource.memory-mb = 10240
and my spark-defaults.conf contains:
spark.yarn.executor.memoryOverhead=2048
spark.driver.memory=3g
If there are any other settings you'd like to know about, please let me know.
How do I set the container size in yarn appropriately?
(bounty on the way for someone who can help me with this)
Let me first explain the basic set of properties required to tune your Spark application on a YARN cluster.

Note: a container in YARN plays the same role as an executor in Spark. For this discussion you can treat the two as the same thing.
On yarn-site.xml:

yarn.nodemanager.resource.memory-mb is the total memory that a given node contributes to the cluster (for containers).

yarn.nodemanager.resource.cpu-vcores is the total number of CPU vcores that a given node contributes to the cluster.

yarn.scheduler.maximum-allocation-mb is the maximum memory, in MB, that can be allocated per YARN container.

yarn.scheduler.maximum-allocation-vcores is the maximum number of vcores that can be allocated per YARN container.
Example: if a node has 16GB and 8 vcores and you would like to contribute 14GB and 6 vcores to the cluster (for containers), then set the properties as shown below:

yarn.nodemanager.resource.memory-mb : 14336 (14GB)
yarn.nodemanager.resource.cpu-vcores : 6

And, to create containers with 2GB and 1 vcore each, set these properties:
yarn.scheduler.maximum-allocation-mb : 2048
yarn.scheduler.maximum-allocation-vcores : 1
Note: even though there is enough memory (14GB) to create 7 containers of 2GB each, the above config will only create 6 such containers, so only 12GB of the 14GB will be utilized by the cluster. This is because only 6 vcores are available, and each container needs 1 vcore.
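In yarn-site.xml these entries take the usual Hadoop XML form; for the example values above that would look like:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>14336</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>6</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>1</value>
</property>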
Now on the Spark side:

The properties below specify the memory to be requested for the driver and for each executor (one container each):

spark.driver.memory
spark.executor.memory

The properties below specify the vcores to be requested for the driver and for each executor:

spark.driver.cores
spark.executor.cores
Important: all of Spark's memory and vcore requests (including the memory overhead) must be less than or equal to the maximums set in the YARN configuration.
The property below specifies the total number of executors/containers to be used for your Spark application from the YARN cluster:

spark.executor.instances

This value should be less than the total number of containers available in the YARN cluster.
Once the YARN configuration is complete, Spark will only get containers that fit within those YARN limits. If YARN is configured to allocate a maximum of 2GB per container and Spark requests a container with 3GB of memory, the job will halt or fail because YARN cannot satisfy the request.
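To make the arithmetic concrete, here is a minimal sketch of the check YARN effectively performs, using illustrative values (the total request is the executor memory plus the memory overhead):

# Does a Spark container request fit within YARN's per-container cap?
yarn_max_allocation_mb = 2048  # yarn.scheduler.maximum-allocation-mb
executor_memory_mb = 1536      # spark.executor.memory
memory_overhead_mb = 512       # spark.yarn.executor.memoryOverhead
requested_mb = executor_memory_mb + memory_overhead_mb  # 2048
if requested_mb <= yarn_max_allocation_mb:
    print("request fits: %d MB <= %d MB" % (requested_mb, yarn_max_allocation_mb))
else:
    print("YARN cannot satisfy a %d MB container request" % requested_mb)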
Now for your use case: cluster tuning usually depends on the workload, but the config below should be a suitable starting point.
Memory available: 10GB * 2 nodes
Vcores available: 5 vcores * 2 nodes [assumption]
On yarn-site.xml [on both nodes]:

yarn.nodemanager.resource.memory-mb : 10240
yarn.nodemanager.resource.cpu-vcores : 5
yarn.scheduler.maximum-allocation-mb : 2048
yarn.scheduler.maximum-allocation-vcores : 1
Using the above config, you can create a maximum of 5 containers on each node (10 across the cluster), each with 2GB and 1 vcore.
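As a quick sanity check, the per-node container count is bounded by whichever resource runs out first (illustrative arithmetic only, using the assumed node sizes above):

# How many 2GB/1-vcore containers fit on one 10GB/5-vcore node?
node_memory_mb, node_vcores = 10240, 5
container_memory_mb, container_vcores = 2048, 1
per_node = min(node_memory_mb // container_memory_mb,  # limited by memory: 5
               node_vcores // container_vcores)        # limited by vcores: 5
print("containers per node: %d, cluster total: %d" % (per_node, per_node * 2))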
Spark config (in spark-defaults.conf):

spark.driver.memory : 1536m
spark.yarn.driver.memoryOverhead : 512
spark.executor.memory : 1536m
spark.yarn.executor.memoryOverhead : 512
spark.driver.cores : 1
spark.executor.cores : 1
spark.executor.instances : 9

(9 executors, because the cluster has 10 containers in total and one of them is needed for the driver/application master; the overhead values are in MB.)
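Equivalently, here is a sketch of how these could be passed when launching the shell, using the standard pyspark/spark-submit options (this assumes yarn-client mode, as when you run the pyspark shell):

pyspark --master yarn \
  --driver-memory 1536m \
  --executor-memory 1536m \
  --executor-cores 1 \
  --num-executors 9 \
  --conf spark.yarn.driver.memoryOverhead=512 \
  --conf spark.yarn.executor.memoryOverhead=512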
Please feel free to play around with these configurations to suit your needs.