Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Apache Spark - Why are executor being removed? What does 'Idle' mean?

Tags:

apache-spark

With Dynamic Resource Allocation :

A Spark application removes an executor when it has been idle for more than spark.dynamicAllocation.executorIdleTimeout seconds.

And when I set the executor idle timeout property as followed spark.dynamicAllocation.executorIdleTimeout= 300 , it throws the following warning :

spark.ExecutorAllocationManager: Removing executor 0 because it has been idle for 300 seconds (new desired total will be 2)

What does mean to be "Idle"? Does it means the worker use no CPU? Does a blocking call to database count as Idle?

like image 544
Lorenzo Belli Avatar asked Jul 05 '17 09:07

Lorenzo Belli


2 Answers

What does mean to be "Idle"?

To answer this question, I'd like to go back to the official documentation. So, like mentioned in the documentation that you have cited, the dynamic allocation mechanism provides Spark with the ability to adjust dynamically the resources your application occupies based on the workload.

This means that your application may :

  • Request them again later when needed.

  • Give resources back to the cluster if they are no longer used: The executor state is considered as idle, which is the term to specify that it's sitting there doing nothing (and reserving resources) i.e : in your case, it generated the following warning :

    spark.ExecutorAllocationManager: Removing executor 0 because it has been idle for 300 seconds (new desired total will be 2)
    

Does it means the worker uses no cpu?

An executor reserves CPU and memory. In case other applications need to use them, well it's reserved. Your resource manager can't allocate resources for your other applications. Thus freeing them can be prolific in case of multiple application sharing the same cluster.

Does a blocking call to database count as Idle?

A call to a database usually asks for resources, thus, the executor isn't idle when performing any kind of task (even this one).

To know more about the ExecutorAllocationManager, I advice you to look into it's code here.

like image 95
eliasah Avatar answered Nov 03 '22 00:11

eliasah


Idle means two things:

  • No active stages on this executor
  • No (explicitly) persisted data on this executor
like image 33
Rick Moritz Avatar answered Nov 03 '22 02:11

Rick Moritz