Why is Spark not distributing jobs to all executors, but only to one executor?

My Spark cluster has 1 master and 3 workers (on 4 separate machines, each machine with 1 core). The other settings are shown in pic-1 below, where spark.cores.max is set to 3 and spark.executor.cores is also set to 3.
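For reference, here is a minimal sketch of one way these settings could be passed through SparkConf (the app name and master URL are placeholders, and the same settings could equally come from spark-submit flags or spark-defaults.conf):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: the app name and master URL below are placeholders.
val conf = new SparkConf()
  .setAppName("my-app")                    // placeholder application name
  .setMaster("spark://master-host:7077")   // standalone master URL (placeholder)
  .set("spark.cores.max", "3")             // total cores the application may use across the cluster
  .set("spark.executor.cores", "3")        // cores requested per executor

val sc = new SparkContext(conf)
```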

But when I submit my job to the Spark cluster, I can see from the Spark web UI that only one executor is used (judging by the used memory and RDD blocks in pic-2), not all of them. In this case the processing speed is much slower than I expected.

Since I've set the max cores to 3, shouldn't all the executors be used for this job?

How can I configure Spark to distribute the current job to all executors, instead of having only one executor run it?

Thanks a lot.

------------------ pic-1: Spark settings

------------------ pic-2: Spark web UI executor usage

asked May 14 '15 by keypoint
1 Answer

You said you are running two receivers. What kind of receivers are they (Kafka, HDFS, Twitter)?

Which Spark version are you using?

In my experience, any receiver other than a file-based one permanently occupies 1 core. So when you say you have 2 receivers, 2 cores are permanently used for receiving the data, and you are left with only 1 core doing the actual work.
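To illustrate the core budget (a sketch only; it uses socket receivers and placeholder hosts/ports, not necessarily the receiver types you are actually running):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// With spark.cores.max = 3 and 2 receivers, 2 cores stay pinned to receiving,
// so only 1 core is left for processing the batches.
val conf = new SparkConf()
  .setAppName("receiver-core-budget")      // placeholder application name
  .set("spark.cores.max", "3")

val ssc = new StreamingContext(conf, Seconds(10))

// Each receiver-based input stream starts one long-running task that holds a core.
val stream1 = ssc.socketTextStream("host-a", 9999)  // receiver 1 -> 1 core
val stream2 = ssc.socketTextStream("host-b", 9999)  // receiver 2 -> 1 core

// Only the remaining core (3 - 2 = 1) does the batch processing.
stream1.union(stream2).count().print()

ssc.start()
ssc.awaitTermination()
```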

Please post the Spark master homepage screenshot as well, and a screenshot of the job's Streaming page.

answered Oct 02 '22 by Lokesh Kumar P