 

Spark worker memory

Tags:

apache-spark

I have set up a Spark (1.6) standalone cluster with 1 master, and I have added 3 machines as workers in the conf/slaves file. Even though I have allocated 4 GB of memory to each of my workers in Spark, why does it use only 1024 MB when the application is running? I would like it to use all 4 GB allocated to it. Help me figure out where and what I am doing wrong.

Below is a screenshot of the Spark master page (while the application is running via spark-submit), where the Memory column shows 1024.0 MB used in brackets next to 4.0 GB.

I also tried setting the --executor-memory 4G option with spark-submit (as suggested in How to change memory per node for apache spark worker), and it does not work.
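For reference, this is roughly how I invoke spark-submit; the master URL, class name and jar path below are placeholders for my actual values:

# roughly my spark-submit invocation (names and paths are placeholders)
$SPARK_HOME/bin/spark-submit \
  --master spark://master-host:7077 \
  --class com.example.MyApp \
  --executor-memory 4G \
  /path/to/my-app.jar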

These are the options I have set in the spark-env.sh file:

export SPARK_WORKER_CORES=3
export SPARK_WORKER_MEMORY=4g
export SPARK_WORKER_INSTANCES=2

[Screenshot of the Spark master web UI showing the workers with 4.0 GB (1024.0 MB Used) in the Memory column]

B1K asked Apr 04 '16


People also ask

What is Spark Python worker memory?

spark.python.worker.memory (default 512m). Amount of memory to use per Python worker process during aggregation, in the same format as JVM memory strings with a size-unit suffix ("k", "m", "g" or "t"), e.g. 512m or 2g. If the memory used during aggregation exceeds this amount, Spark will spill the data to disk.
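As an illustration, this property can be set in conf/spark-defaults.conf or passed per job with spark-submit's --conf flag (the 2g value here is only an example, not a recommendation):

# in conf/spark-defaults.conf
spark.python.worker.memory    2g

# or per job on the command line
spark-submit --conf spark.python.worker.memory=2g ...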

How much memory should I give my Spark driver?

110 GB RAM (10 GB per node)

How does Spark work in memory?

In-memory cluster computation enables Spark to run iterative algorithms, as programs can checkpoint data and refer back to it without reloading it from disk; in addition, it supports interactive querying and streaming data analysis at extremely fast speeds.

What is Spark user memory?

User Memory is the memory used to store user-defined data structures, Spark internal metadata, any UDFs created by the user, and the data needed for RDD conversion operations, such as RDD dependency information.
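In Spark's unified memory model (1.6+), this region is the part of the heap not claimed by the execution/storage pool, so one way to leave more room for user data structures is to lower spark.memory.fraction. A minimal sketch in conf/spark-defaults.conf (0.5 is only an illustrative value):

# shrink the execution/storage pool, enlarging user memory (illustrative value)
spark.memory.fraction    0.5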


1 Answer

Another workaround is to try setting the following parameters in the conf/spark-defaults.conf file:

spark.driver.cores              4
spark.driver.memory             2g
spark.executor.memory           4g

Once you have set the above (only the last line in your case), shut down all the workers at once and restart them. It is better to initialize the executor memory this way, since your problem seems to be that no executor can allocate all the available memory of its worker.
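With the standard standalone scripts, a minimal restart from the master node looks something like this (assuming $SPARK_HOME points at the Spark installation and conf/slaves lists your workers):

$SPARK_HOME/sbin/stop-all.sh     # stops the master and all workers listed in conf/slaves
$SPARK_HOME/sbin/start-all.sh    # brings the master and workers back up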

raschild answered Sep 20 '22