With Dynamic Resource Allocation:
A Spark application removes an executor when it has been idle for more than spark.dynamicAllocation.executorIdleTimeout seconds.
When I set the executor idle timeout property as follows:
spark.dynamicAllocation.executorIdleTimeout=300
it logs the following warning:
spark.ExecutorAllocationManager: Removing executor 0 because it has been idle for 300 seconds (new desired total will be 2)
What does it mean to be "idle"? Does it mean the worker uses no CPU? Does a blocking call to a database count as idle?
What does it mean to be "idle"?
To answer this question, let's go back to the official documentation. As stated in the documentation you cited, the dynamic allocation mechanism lets Spark dynamically adjust the resources your application occupies based on the workload.
This means that your application may give resources back to the cluster when they are no longer used and request them again later when needed. An executor that is sitting there doing nothing while still reserving resources is considered idle. In your case, removing such an executor produced the following warning:
spark.ExecutorAllocationManager: Removing executor 0 because it has been idle for 300 seconds (new desired total will be 2)
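For illustration, here is a minimal sketch of how the related settings might be passed to spark-submit. The property names are real Spark settings, but the values and the helper name to_spark_submit_flags are assumptions made up for this example, and whether you need the external shuffle service depends on your cluster setup.

```python
# Property names are real Spark settings; the values are illustrative assumptions.
dynamic_allocation_conf = {
    "spark.dynamicAllocation.enabled": "true",
    # Remove an executor once it has been idle for this long.
    "spark.dynamicAllocation.executorIdleTimeout": "300s",
    # Assumption: the cluster runs the external shuffle service, which
    # dynamic allocation commonly relies on.
    "spark.shuffle.service.enabled": "true",
}


def to_spark_submit_flags(conf):
    """Render a dict of Spark properties as --conf flags (hypothetical helper)."""
    flags = []
    for key, value in conf.items():
        flags.extend(["--conf", f"{key}={value}"])
    return flags


print(" ".join(to_spark_submit_flags(dynamic_allocation_conf)))
```

The printed flags can be appended to a spark-submit command line; the same properties could equally be set in spark-defaults.conf.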
Does it mean the worker uses no CPU?
An executor reserves CPU and memory whether or not it is actually using them. While they are reserved, your resource manager can't allocate them to other applications, so freeing them is beneficial when multiple applications share the same cluster. "Idle" is therefore not about CPU usage as such: it describes an executor that has no tasks running on it.
Does a blocking call to a database count as idle?
No. A database call made from within a task means the executor is still running that task, so it isn't idle while the call blocks, even if it is using little CPU at that moment.
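To make the distinction concrete, here is a toy model of the idle bookkeeping. It is a simplified sketch and not Spark's actual ExecutorAllocationManager code; the class and method names are hypothetical, and it assumes idleness is decided purely by whether any task is running.

```python
class ExecutorIdleTracker:
    """Toy model: an executor is idle only while it has no running tasks."""

    def __init__(self, idle_timeout_s, now=0.0):
        self.idle_timeout_s = idle_timeout_s
        self.running_tasks = 0
        self.idle_since = now  # time at which the executor last became idle

    def task_started(self):
        self.running_tasks += 1
        self.idle_since = None  # busy, regardless of CPU usage

    def task_finished(self, now):
        self.running_tasks -= 1
        if self.running_tasks == 0:
            self.idle_since = now

    def should_remove(self, now):
        if self.idle_since is None:
            return False
        return now - self.idle_since >= self.idle_timeout_s


tracker = ExecutorIdleTracker(idle_timeout_s=300)
tracker.task_started()                  # task blocks on a database call
print(tracker.should_remove(now=900))   # False: a task is still running
tracker.task_finished(now=1000)
print(tracker.should_remove(now=1299))  # False: idle for only 299 seconds
print(tracker.should_remove(now=1300))  # True: idle for 300 seconds
```

In this model the long blocking call never counts toward the timeout, because the clock only starts once the last running task has finished.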
To learn more about the ExecutorAllocationManager, I advise you to look into its code here.
Idle means two things: