 

Why does Spark kill tasks?

I'm running a Spark computational application and I regularly run into an issue with tasks being killed. Here is how it looks in my Spark console:

[Screenshot of the Spark console showing the killed tasks]

As can be seen, there are some jobs with the description (_num_ killed: another attempt succeeded). This is not simply a failure; it is something different. Can someone explain what it is?

St.Antario asked Nov 18 '25


2 Answers

If a task appears to be taking an unusually long time to complete, Spark may launch extra duplicate copies of that task in the hope that one of them finishes sooner. This is referred to as speculation or speculative execution. Once one copy succeeds, the remaining copies are killed, which is what the "killed: another attempt succeeded" description indicates.

See the parameters starting with spark.speculation here: https://spark.apache.org/docs/latest/configuration.html
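For reference, here is a minimal sketch of how speculative execution can be enabled and tuned in a Scala application (the SparkSession setup and the specific threshold values are illustrative, not taken from the question):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: turn on speculative execution so slow ("straggler") tasks
// get duplicate attempts. Whichever attempt finishes first wins; the other
// attempts are then killed and show up in the UI as
// "killed: another attempt succeeded".
val spark = SparkSession.builder()
  .appName("speculation-demo")
  .config("spark.speculation", "true")            // speculation is off by default
  .config("spark.speculation.interval", "100ms")  // how often Spark checks for stragglers
  .config("spark.speculation.multiplier", "1.5")  // a task is a straggler if it runs 1.5x slower than the median
  .config("spark.speculation.quantile", "0.75")   // fraction of tasks that must finish before speculation kicks in
  .getOrCreate()
```

The same settings can also be passed at submit time, e.g. `spark-submit --conf spark.speculation=true ...`, without changing the application code.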

Joe K answered Nov 20 '25


Killed means the executor was killed by the Worker that stopped it and asked for it to be killed. This can happen for several reasons, for example a user-driven action, or the executor finished its processing but no longer exists while the Worker is exiting, so the Worker needs to kill the executor.

vaquar khan answered Nov 20 '25


