What controls how much of a Spark Cluster is given to an application?

On this page of the docs, https://spark.apache.org/docs/latest/job-scheduling.html, it says about static partitioning: "With this approach, each application is given a maximum amount of resources it can use".

I was just wondering, what are these maximum resources? I found the memory-per-executor setting (mentioned just below, under dynamic partitioning), which I assume limits the amount of memory an application gets. But what decides how many executors are started and how many nodes from the cluster are used, e.g. the total cluster memory and the cores that get "taken"?

On a similar note, is there a way to change the memory requested on a per-job or per-task level?

James k asked Jan 14 '15 14:01

People also ask

Who controls the execution of a Spark application?

Spark uses a master/slave architecture, i.e. one central coordinator and many distributed workers. Here, the central coordinator is called the driver. The driver runs in its own Java process.
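
As a small illustration (the host name and application file below are placeholders, not from the question), the deploy mode chosen at submit time decides where that driver process runs:

spark-submit --master spark://master-host:7077 --deploy-mode client my-app.jar    # driver runs on the machine you submit from
spark-submit --master spark://master-host:7077 --deploy-mode cluster my-app.jar   # driver runs on a worker inside the cluster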

How does a Spark application run on a cluster?

Once connected, Spark acquires executors on nodes in the cluster, which are processes that run computations and store data for your application. Next, it sends your application code (defined by JAR or Python files passed to SparkContext) to the executors. Finally, SparkContext sends tasks to the executors to run.
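
For instance, a minimal submission might look like this (the file names are made up for illustration); the listed Python file and archive are the application code that gets shipped to the executors:

spark-submit --master spark://master-host:7077 \
  --py-files helpers.zip \
  my_job.py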

What is the role of cluster manager in Spark?

The cluster manager is the platform (cluster mode) on which Spark runs. Simply put, the cluster manager provides resources to all the worker nodes as needed and operates them accordingly. A cluster consists of a master node and worker nodes.
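
Concretely, the cluster manager is selected by the master URL passed at submit time (hosts, ports and the jar name below are placeholders):

spark-submit --master spark://master-host:7077 my-app.jar    # Spark standalone cluster manager
spark-submit --master mesos://mesos-master:5050 my-app.jar   # Apache Mesos
spark-submit --master yarn my-app.jar                        # Hadoop YARN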


1 Answer

The amount of resources depends on the cluster manager being used, as different cluster managers provide different kinds of allocation.

E.g. in standalone mode, Spark will try to use all nodes. spark.cores.max controls how many cores in total the job will take across nodes. If it is not set, Spark will use spark.deploy.defaultCores. The documentation for spark.deploy.defaultCores further clarifies its use:

Default number of cores to give to applications in Spark's standalone mode if they don't set spark.cores.max. If not set, applications always get all available cores unless they configure spark.cores.max themselves. Set this lower on a shared cluster to prevent users from grabbing the whole cluster by default.
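
A sketch of how the two settings might be used in practice (host names and numbers are made up): the per-application cap goes on the submit command, while the cluster-wide default is a master-side property, typically set through SPARK_MASTER_OPTS in spark-env.sh on the standalone master.

# Per-application cap on the total cores taken across the cluster
spark-submit --master spark://master-host:7077 \
  --conf spark.cores.max=8 \
  --executor-memory 2g \
  my-app.jar

# Cluster-wide default, applied only to applications that do not set spark.cores.max
# (in conf/spark-env.sh on the standalone master)
SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=16"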

In Mesos coarse-grained mode, Spark will allocate all available cores by default. Use spark.cores.max to limit that per job.
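
For example (the host and values are placeholders), a coarse-grained Mesos job capped at 8 cores; spark.mesos.coarse=true is spelled out here because its default differs between Spark versions:

spark-submit --master mesos://mesos-master:5050 \
  --conf spark.mesos.coarse=true \
  --conf spark.cores.max=8 \
  my-app.jar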

In Mesos fine-grained mode, Spark will allocate a core per task as needed by the job and release them afterwards. This ensures fair usage at the cost of higher task allocation overhead.
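
Fine-grained mode is selected by turning coarse-grained mode off (note that newer Spark versions deprecate this mode); the host is again a placeholder:

spark-submit --master mesos://mesos-master:5050 \
  --conf spark.mesos.coarse=false \
  my-app.jar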

In YARN, per the documentation:

The --num-executors option to the Spark YARN client controls how many executors it will allocate on the cluster, while --executor-memory and --executor-cores control the resources per executor.
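
Put together, a YARN submission might look like this (the numbers are arbitrary examples, not recommendations):

spark-submit --master yarn \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 4g \
  my-app.jar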

Regarding memory, there is no way to set the total memory per job or task, only per executor, using spark.executor.memory. The total memory assigned to your job will be spark.executor.memory × the number of executors.
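
For example, with the hypothetical YARN submission above (10 executors at 4g each), the executors hold roughly 10 × 4g = 40g in total; driver memory and per-executor memory overhead come on top of that, and there is no per-task memory knob.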

maasg answered Nov 25 '22 06:11