 

The value of "spark.yarn.executor.memoryOverhead" setting?

Is the value of spark.yarn.executor.memoryOverhead in a Spark job on YARN allocated to the application, or is it just the maximum value?

asked Dec 09 '16 by liyong


People also ask

What is Spark YARN executor memoryOverhead used for?

The memoryOverhead property is added to the executor memory to determine the full memory request to YARN for each executor. It defaults to max(executorMemory * 0.10, 384), i.e. 10% of the executor memory, with a minimum of 384 MB.

What is the default value for Spark executor cores?

spark.executor.cores defaults to 1 in YARN mode, and to all the available cores on the worker in standalone mode.

What is Spark YARN memoryOverhead?

spark.yarn.driver.memoryOverhead is the amount of off-heap memory (in megabytes) to be allocated per driver in cluster mode; it works the same way as the executor's memoryOverhead.

How is executor of Spark calculated?

Number of available executors = (total cores / num-cores-per-executor); for example, 150 / 5 = 30, as sketched below.
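A quick illustration of that arithmetic in Scala (the 150 total cores and 5 cores per executor are just the example figures from above, not defaults):

    // Deriving the executor count from cluster resources
    val totalCores = 150                                    // total cores in the cluster (example figure)
    val coresPerExecutor = 5                                // --executor-cores (example figure)
    val availableExecutors = totalCores / coresPerExecutor  // 150 / 5 = 30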


1 Answer

spark.yarn.executor.memoryOverhead

It is just the max value. The goal is to calculate the overhead as a percentage of the real executor memory, as used by RDDs and DataFrames.

--executor-memory/spark.executor.memory

controls the executor heap size, but JVMs can also use some memory off heap, for example for interned Strings and direct byte buffers.

The value of the spark.yarn.executor.memoryOverhead property is added to the executor memory to determine the full memory request to YARN for each executor. It defaults to max(executorMemory * 0.10, 384), i.e. 10% of the executor memory, with a minimum of 384 MB.
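As a worked example of that default in Scala (the 8 GB executor size is an arbitrary figure chosen for illustration):

    // Default overhead: max(executorMemory * 0.10, 384 MB)
    val executorMemoryMb = 8192L                                       // e.g. --executor-memory 8g
    val overheadMb = math.max((executorMemoryMb * 0.10).toLong, 384L)  // max(819, 384) = 819 MB
    val containerRequestMb = executorMemoryMb + overheadMb             // 8192 + 819 = 9011 MB requested from YARN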

The executors will use a memory allocation based on the spark.executor.memory property plus an overhead defined by spark.yarn.executor.memoryOverhead.
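A minimal sketch of setting both values explicitly (the 8g heap and 1024 MB overhead are arbitrary example figures; note that Spark 2.3+ renamed the property to spark.executor.memoryOverhead):

    import org.apache.spark.sql.SparkSession

    // Each YARN container will be sized at roughly 8 GB of heap plus 1 GB of off-heap overhead
    val spark = SparkSession.builder()
      .appName("memory-overhead-example")                    // hypothetical app name
      .config("spark.executor.memory", "8g")                 // executor heap size
      .config("spark.yarn.executor.memoryOverhead", "1024")  // off-heap overhead, in MB
      .getOrCreate()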

answered Oct 17 '22 by Indrajit Swain