 

Spark worker memory

Tags:

apache-spark

I have set up a Spark (1.6) standalone cluster with 1 master, and I have added 3 machines as workers in the conf/slaves file. Even though I have allocated 4 GB of memory to each of my workers in Spark, why does it use only 1024 MB when the application is running? I would like it to use all 4 GB allocated to it. Help me figure out where and what I am doing wrong.

Below is a screenshot of the Spark master page (while the application is running via spark-submit), where the Memory column shows 1024.0 MB used in brackets next to 4.0 GB.

I also tried setting the --executor-memory 4G option with spark-submit (as suggested in How to change memory per node for apache spark worker), and it does not work.
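For reference, this is roughly how I invoke spark-submit; the master URL, class name and jar path below are placeholders for my actual values:

# roughly my spark-submit invocation (names and paths are placeholders)
$SPARK_HOME/bin/spark-submit \
  --master spark://master-host:7077 \
  --class com.example.MyApp \
  --executor-memory 4G \
  /path/to/my-app.jar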

These are the options I have set in the spark-env.sh file:

export SPARK_WORKER_CORES=3
export SPARK_WORKER_MEMORY=4g
export SPARK_WORKER_INSTANCES=2

[Screenshot of the Spark master web UI showing the workers with 4.0 GB (1024.0 MB Used) in the Memory column]

B1K asked Apr 04 '16


People also ask

What is Spark Python worker memory?

spark.python.worker.memory (default 512m). Amount of memory to use per Python worker process during aggregation, in the same format as JVM memory strings with a size-unit suffix ("k", "m", "g" or "t"), e.g. 512m or 2g. If the memory used during aggregation exceeds this amount, Spark will spill the data to disk.
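As an illustration, this property can be set in conf/spark-defaults.conf or passed per job with spark-submit's --conf flag (the 2g value here is only an example, not a recommendation):

# in conf/spark-defaults.conf
spark.python.worker.memory    2g

# or per job on the command line
spark-submit --conf spark.python.worker.memory=2g ...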

How much memory should I give my Spark driver?

110 GB RAM (10 GB per node)

How does Spark work in memory?

In-memory cluster computation enables Spark to run iterative algorithms, as programs can checkpoint data and refer back to it without reloading it from disk; in addition, it supports interactive querying and streaming data analysis at extremely fast speeds.

What is Spark user memory?

User Memory is the memory used to store user-defined data structures, Spark internal metadata, any UDFs created by the user, and the data needed for RDD conversion operations, such as RDD dependency information.
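In Spark's unified memory model (1.6+), this region is the part of the heap not claimed by the execution/storage pool, so one way to leave more room for user data structures is to lower spark.memory.fraction. A minimal sketch in conf/spark-defaults.conf (0.5 is only an illustrative value):

# shrink the execution/storage pool, enlarging user memory (illustrative value)
spark.memory.fraction    0.5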


1 Answer

Another workaround is to try setting the following parameters in the conf/spark-defaults.conf file:

spark.driver.cores              4
spark.driver.memory             2g
spark.executor.memory           4g

Once you have set the above (only the last line in your case), shut down all the workers at once and restart them. It is better to initialize the executor memory this way, since your problem seems to be that no executor can allocate all the available memory of its worker.
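With the standard standalone scripts, a minimal restart from the master node looks something like this (assuming $SPARK_HOME points at the Spark installation and conf/slaves lists your workers):

$SPARK_HOME/sbin/stop-all.sh     # stops the master and all workers listed in conf/slaves
$SPARK_HOME/sbin/start-all.sh    # brings the master and workers back up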

raschild answered Sep 20 '22