My Spark cluster has 1 master and 3 workers (on 4 separate machines, each machine with 1 core), and the other settings are as in the picture below, where spark.cores.max is set to 3 and spark.executor.cores is also 3 (pic-1).
But when I submit a job to the cluster, the Spark web UI shows that only one executor is used (judging by the used memory and RDD blocks in pic-2), not all of them. In this case the processing speed is much slower than I expected.
Since I've set the max cores to 3, shouldn't all the executors be used for this job?
How do I configure Spark to distribute the current job across all executors, instead of having only one executor run it?
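In case it helps, the settings are applied roughly along these lines (just a sketch; the master URL and application name below are placeholders, the actual values are the ones shown in pic-1):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setMaster("spark://master-host:7077") // standalone cluster: 1 master, 3 workers
  .setAppName("my-streaming-job")        // placeholder name
  .set("spark.cores.max", "3")           // total cores the application may claim
  .set("spark.executor.cores", "3")      // cores requested per executor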
Thanks a lot.
pic-1: [screenshot of the Spark settings]
pic-2: [screenshot of the Spark web UI showing executor usage]
You said you are running two receivers. What kind of receivers are they (Kafka, HDFS, Twitter, ...)?
Which Spark version are you using?
In my experience, if you are using any receiver other than a file receiver, it occupies one core permanently. So when you say you have 2 receivers, 2 cores will be permanently used for receiving the data, leaving you with only 1 core to do the actual work.
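To make the core accounting concrete, here is a minimal sketch of a streaming job with two receiver-based inputs, using socket receivers purely for illustration (the hostnames and ports are placeholders; the same accounting applies to receiver-based Kafka, Twitter, etc. streams):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// 3 workers x 1 core each = 3 cores total, all granted to this app (spark.cores.max = 3).
val conf = new SparkConf()
  .setAppName("receiver-core-accounting")
  .set("spark.cores.max", "3")
val ssc = new StreamingContext(conf, Seconds(10))

// Each receiver-based input stream pins one core for the lifetime of the job,
// so these two receivers permanently occupy 2 of the 3 cores...
val streamA = ssc.socketTextStream("host-a", 9999) // placeholder source
val streamB = ssc.socketTextStream("host-b", 9999) // placeholder source

// ...leaving a single core to run the actual processing tasks of every batch.
streamA.union(streamB).count().print()

ssc.start()
ssc.awaitTermination()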
Please post a screenshot of the Spark master homepage as well, along with the job's Streaming page.