I am working with the following Spark config:
maxCores = 5
driverMemory=2g
executorMemory=17g
executorInstances=100
Issue: out of 100 executors, my job ends up with only 10 active executors, even though enough memory is available. Even when I set the number of executors to 250, only 10 remain active. All I am trying to do is load a multi-partition Hive table and run df.count over it.
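In code, the job is essentially just the following (a minimal sketch on my side; the database/table name is a placeholder):

import org.apache.spark.sql.SparkSession

// Open a Hive-enabled session, load the multi-partition table and count its rows.
val spark = SparkSession.builder()
  .appName("hive-table-count")                        // hypothetical app name
  .enableHiveSupport()
  .getOrCreate()

val df = spark.table("my_db.my_partitioned_table")    // placeholder table name
println(df.count())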
Please help me understand what is causing the executors to be killed. The executor log shows:
17/12/20 11:08:21 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
17/12/20 11:08:21 INFO storage.DiskBlockManager: Shutdown hook called
17/12/20 11:08:21 INFO util.ShutdownHookManager: Shutdown hook called
Not sure why YARN is killing my executors.
CoarseGrainedExecutorBackend is an ExecutorBackend that manages a single coarse-grained executor (which lives as long as the owning executor backend). It registers itself with the driver as a ThreadSafeRpcEndpoint under the name Executor.
Memory overhead is the amount of off-heap memory allocated to each executor. By default, memory overhead is set to either 10% of executor memory or 384 MB, whichever is higher.
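As a rough illustration with the 17g executor memory from the question, the default rule gives each executor container a request of roughly 18.7 GB (this is just my arithmetic, not from the original post):

// Sketch of the default container sizing described above, using the
// 17g executor memory from the question.
val executorMemoryMb = 17 * 1024                                     // 17408 MB
val memoryOverheadMb = math.max(0.10 * executorMemoryMb, 384).toInt  // max(1740.8, 384) ≈ 1740 MB
val containerSizeMb  = executorMemoryMb + memoryOverheadMb           // 19148 MB ≈ 18.7 GB
println(s"YARN container request per executor: $containerSizeMb MB")

YARN enforces this limit per container, so an executor whose actual usage (heap plus off-heap) goes above it gets killed by the NodeManager.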
I faced a similar issue, and investigating the NodeManager logs led me to the root cause. You can access them via the web interface at
nodeManagerAddress:PORT/logs
The PORT is specified in yarn-site.xml under yarn.nodemanager.webapp.address (default: 8042).
My investigation workflow:
Btw: you can access the aggregated collection (XML) of all configuration settings affecting a node on the same port at:
nodeManagerAddress:PORT/conf